Spelling suggestions: "subject:"federation learning"" "subject:"federativa learning""
81 |
Federated Learning with FEDn for Financial Market SurveillanceVoltaire Edoh, Isak January 2022 (has links)
Machine Learning (ML) is the current trend that most industries opt for to improve their business and operations. ML has also been adopted in the financial markets, where well-funded financial institutions employ the latest ML algorithms to gain an advantage on the market. The darker side of ML is the potential emergence of complex algorithmic trading schemes that are abusive and manipulative. Because of this, it is inevitable that ML will be applied to financial market surveillance in order to detect these abusive and manipulative trading strategies. Ideally, an accurate ML detection model would be developed with data from many financial institutions or trading venues. However, such ML models require vast quantities of data, which poses a problem in market surveillance where data is sensitive or limited. Data sharing between companies or countries is typically accompanied by legal and privacy concerns. By training ML models on distributed datasets, Federated Learning (FL) overcomes these issues by eliminating the need to centralise sensitive data. This thesis aimed to address these ML related issues in market surveillance by implementing and evaluating a FL model. FL enables a group of independent data-holding clients with the same intention to build a shared ML model collaboratively without compromising private data. In this work, a ML model is initially deployed in a centralised data setting and trained to detect the manipulative trading scheme known as spoofing. The LSTM-Autoencoder was the model chosen method for this task. The same model is also implemented in a federated setting but with decentralised data, using the FL framework FEDn. Another FL framework, Flower, is also employed to evaluate the performance of FEDn. Experiments were conducted comparing the FL models to the conventional centralised learning model, as well as comparing the two frameworks to each other. The results showed that under certain circumstances, the FL models performed better than the centralised model in detecting spoofing. FEDn was equivalent to Flower in terms of detection performance. In addition, the results indicated that Flower was marginally faster than FEDn. It is assumed that variations in the experimental setup and stochasticity account for the performance disparity.
|
82 |
Federated Learning for Time Series Forecasting Using LSTM Networks: Exploiting Similarities Through Clustering / Federerad inlärning för tidserieprognos genom LSTM-nätverk: utnyttjande av likheter genom klustringDíaz González, Fernando January 2019 (has links)
Federated learning poses a statistical challenge when training on highly heterogeneous sequence data. For example, time-series telecom data collected over long intervals regularly shows mixed fluctuations and patterns. These distinct distributions are an inconvenience when a node not only plans to contribute to the creation of the global model but also plans to apply it on its local dataset. In this scenario, adopting a one-fits-all approach might be inadequate, even when using state-of-the-art machine learning techniques for time series forecasting, such as Long Short-Term Memory (LSTM) networks, which have proven to be able to capture many idiosyncrasies and generalise to new patterns. In this work, we show that by clustering the clients using these patterns and selectively aggregating their updates in different global models can improve local performance with minimal overhead, as we demonstrate through experiments using realworld time series datasets and a basic LSTM model. / Federated Learning utgör en statistisk utmaning vid träning med starkt heterogen sekvensdata. Till exempel så uppvisar tidsseriedata inom telekomdomänen blandade variationer och mönster över längre tidsintervall. Dessa distinkta fördelningar utgör en utmaning när en nod inte bara ska bidra till skapandet av en global modell utan även ämnar applicera denna modell på sin lokala datamängd. Att i detta scenario införa en global modell som ska passa alla kan visa sig vara otillräckligt, även om vi använder oss av de mest framgångsrika modellerna inom maskininlärning för tidsserieprognoser, Long Short-Term Memory (LSTM) nätverk, vilka visat sig kunna fånga komplexa mönster och generalisera väl till nya mönster. I detta arbete visar vi att genom att klustra klienterna med hjälp av dessa mönster och selektivt aggregera deras uppdateringar i olika globala modeller kan vi uppnå förbättringar av den lokal prestandan med minimala kostnader, vilket vi demonstrerar genom experiment med riktigt tidsseriedata och en grundläggande LSTM-modell.
|
83 |
Federated Learning for Time Series Forecasting Using Hybrid ModelLi, Yuntao January 2019 (has links)
Time Series data has become ubiquitous thanks to affordable edge devices and sensors. Much of this data is valuable for decision making. In order to use these data for the forecasting task, the conventional centralized approach has shown deficiencies regarding large data communication and data privacy issues. Furthermore, Neural Network models cannot make use of the extra information from the time series, thus they usually fail to provide time series specific results. Both issues expose a challenge to large-scale Time Series Forecasting with Neural Network models. All these limitations lead to our research question:Can we realize decentralized time series forecasting with a Federated Learning mechanism that is comparable to the conventional centralized setup in forecasting performance?In this work, we propose a Federated Series Forecasting framework, resolving the challenge by allowing users to keep the data locally, and learns a shared model by aggregating locally computed updates. Besides, we design a hybrid model to enable Neural Network models utilizing the extra information from the time series to achieve a time series specific learning. In particular, the proposed hybrid outperforms state-of-art baseline data-central models with NN5 and Ericsson KPI data. Meanwhile, the federated settings of purposed model yields comparable results to data-central settings on both NN5 and Ericsson KPI data. These results together answer the research question of this thesis. / Tidseriedata har blivit allmänt förekommande tack vare överkomliga kantenheter och sensorer. Mycket av denna data är värdefull för beslutsfattande. För att kunna använda datan för prognosuppgifter har den konventionella centraliserade metoden visat brister avseende storskalig datakommunikation och integritetsfrågor. Vidare har neurala nätverksmodeller inte klarat av att utnyttja den extra informationen från tidsserierna, vilket leder till misslyckanden med att ge specifikt tidsserierelaterade resultat. Båda frågorna exponerar en utmaning för storskalig tidsserieprognostisering med neurala nätverksmodeller. Alla dessa begränsningar leder till vår forskningsfråga:Kan vi realisera decentraliserad tidsserieprognostisering med en federerad lärningsmekanism som presterar jämförbart med konventionella centrala lösningar i prognostisering?I det här arbetet föreslår vi ett ramverk för federerad tidsserieprognos som löser utmaningen genom att låta användaren behålla data lokalt och lära sig en delad modell genom att aggregera lokalt beräknade uppdateringar. Dessutom utformar vi en hybrid modell för att möjliggöra neurala nätverksmodeller som kan utnyttja den extra informationen från tidsserierna för att uppnå inlärning av specifika tidsserier. Den föreslagna hybrida modellen presterar bättre än state-of-art centraliserade grundläggande modeller med NN5och Ericsson KPIdata. Samtidigt ger den federerade ansatsen jämförbara resultat med de datacentrala ansatserna för både NN5och Ericsson KPI-data. Dessa resultat svarar tillsammans på forskningsfrågan av denna avhandling.
|
84 |
Models and Representation Learning Mechanisms for Graph DataSusheel Suresh (14228138) 15 December 2022 (has links)
<p>Graph representation learning (GRL) has been increasing used to model and understand data from a wide variety of complex systems spanning social, technological, bio-chemical and physical domains. GRL consists of two main components (1) a parametrized encoder that provides representations of graph data and (2) a learning process to train the encoder parameters. Designing flexible encoders that capture the underlying invariances and characteristics of graph data are crucial to the success of GRL. On the other hand, the learning process drives the quality of the encoder representations and developing principled learning mechanisms are vital for a number of growing applications in self-supervised, transfer and federated learning settings. To this end, we propose a suite of models and learning algorithms for GRL which form the two main thrusts of this dissertation.</p>
<p><br></p>
<p>In Thrust I, we propose two novel encoders which build upon on a widely popular GRL encoder class called graph neural networks (GNNs). First, we empirically study the prediction performance of current GNN based encoders when applied to graphs with heterogeneous node mixing patterns using our proposed notion of local assortativity. We find that GNN performance in node prediction tasks strongly correlates with our local assortativity metric---thereby introducing a limit. We propose to transform the input graph into a computation graph with proximity and structural information as distinct types of edges. We then propose a novel GNN based encoder that operates on this computation graph and adaptively chooses between structure and proximity information. Empirically, adopting our transformation and encoder framework leads to improved node classification performance compared to baselines in real-world graphs that exhibit diverse mixing.</p>
<p>Secondly, we study the trade-off between expressivity and efficiency of GNNs when applied to temporal graphs for the task of link ranking. We develop an encoder that incorporates a labeling approach designed to allow for efficient inference over the candidate set jointly, while provably boosting expressivity. We also propose to optimize a list-wise loss for improved ranking. With extensive evaluation on real-world temporal graphs, we demonstrate its improved performance and efficiency compared to baselines.</p>
<p><br></p>
<p>In Thrust II, we propose two principled encoder learning mechanisms for challenging and realistic graph data settings. First, we consider a scenario where only limited or even no labelled data is available for GRL. Recent research has converged on graph contrastive learning (GCL), where GNNs are trained to maximize the correspondence between representations of the same graph in its different augmented forms. However, we find that GNNs trained by traditional GCL often risk capturing redundant graph features and thus may be brittle and provide sub-par performance in downstream tasks. We then propose a novel principle, termed adversarial-GCL (AD-GCL), which enables GNNs to avoid capturing redundant information during the training by optimizing adversarial graph augmentation strategies used in GCL. We pair AD-GCL with theoretical explanations and design a practical instantiation based on trainable edge-dropping graph augmentation. We experimentally validate AD-GCL by comparing with state-of-the-art GCL methods and achieve performance gains in semi-supervised, unsupervised and transfer learning settings using benchmark chemical and biological molecule datasets. </p>
<p>Secondly, we consider a scenario where graph data is silo-ed across clients for GRL. We focus on two unique challenges encountered when applying distributed training to GRL: (i) client task heterogeneity and (ii) label scarcity. We propose a novel learning framework called federated self-supervised graph learning (FedSGL), which first utilizes a self-supervised objective to train GNNs in a federated fashion across clients and then, each client fine-tunes the obtained GNNs based on its local task and available labels. Our framework enables the federated GNN model to extract patterns from the common feature (attribute and graph topology) space without the need of labels or being biased by heterogeneous local tasks. Extensive empirical study of FedSGL on both node and graph classification tasks yields fruitful insights into how the level of feature / task heterogeneity, the adopted federated algorithm and the level of label scarcity affects the clients’ performance in their tasks.</p>
|
85 |
Privacy leaks from deep linear networks : Information leak via shared gradients in federated learning systems / Sekretessläckor från djupa linjära nätverk : Informationsläckor via delning av gradienter i distribuerade lärande systemShi, Guangze January 2022 (has links)
The field of Artificial Intelligence (AI) has always faced two major challenges. The first is that data is kept scattered and cannot be collected for more efficiently use. The second is that data privacy and security need to be continuously strengthened. Based on these two points, federated learning is proposed as an emerging machine learning scheme. The idea of federated learning is to collaboratively train neural networks on servers. Each user receives the current weights of the network and then sequentially sends parameter updates (gradients) based on their own data. Because the input data remains on-device and only the parameter gradients are shared, this scheme is considered to be effective in preserving data privacy. Some previous attacks also provide a false sense of security since they only succeed in contrived settings, even for a single image. Our research mainly focus on attacks on shared gradients, showing experimentally that private training data can be obtained from publicly shared gradients. We do experiments on both linear-based and convolutional-based deep networks, whose results show that our attack is capable of creating a threat to data privacy, and this threat is independent of the specific structure of neural networks. The method presented in this paper is only to illustrate that it is feasible to recover user data from shared gradients, and cannot be used as an attack to obtain privacy in large quantities. The goal is to spark further research on federated learning, especially gradient security. We also make some brief discussion on possible strategies against our attack methods of privacy. Different methods have their own advantages and disadvantages in terms of privacy protection. Therefore, data pre-processing and network structure adjustment may need to be further researched, so that the process of training the models can achieve better privacy protection while maintaining high precision. / Området artificiell intelligens har alltid stått inför två stora utmaningar. Den första är att data hålls utspridda och inte kan samlas in för mer effektiv användning. Det andra är att datasekretess och säkerhet behöver stärkas kontinuerligt. Baserat på dessa två punkter föreslås federerat lärande som ett framväxande angreppssätt inom maskininlärning. Tanken med federerat lärande är att tillsammans träna neurala nätverk på servrar. Varje användare får nätverkets aktuella vikter och skickar sedan parameteruppdateringar (gradienter) sekventiellt baserat på sina egna data. Eftersom indata förblir på enheten och endast parametergradienterna delas, anses detta schema vara effektivt för att bevara datasekretessen. Vissa tidigare attacker ger också en falsk känsla av säkerhet eftersom de bara lyckas i konstruerade inställningar, även för en enda bild. Vår forskning fokuserar främst på attacker på delade gradienter, och visar experimentellt att privat träningsdata kan erhållas från offentligt delade gradienter. Vi gör experiment på både linjärbaserade och faltningsbaserade djupa nätverk, vars resultat visar att vår attack kan skapa ett hot mot dataintegriteten, och detta hot är oberoende av den specifika strukturen hos djupa nätverk. Metoden som presenteras i denna rapport är endast för att illustrera att det är möjligt att rekonstruera användardata från delade gradienter, och kan inte användas som en attack för att erhålla integritet i stora mängder. Målet är att få igång ytterligare forskning om federerat lärande, särskilt gradientsäkerhet. Vi gör också en kort diskussion om möjliga strategier mot våra attackmetoder för integritet. Olika metoder har sina egna fördelar och nackdelar när det gäller integritetsskydd. Därför kan förbearbetning av data och justering av nätverksstruktur behöva undersökas ytterligare, så att processen med att träna modellerna kan uppnå bättre integritetsskydd samtidigt som hög precision bibehålls.
|
86 |
NETWORK-AWARE FEDERATED LEARNING ACROSS HIGHLY HETEROGENEOUS EDGE/FOG NETWORKSSu Wang (17592381) 09 December 2023 (has links)
<p dir="ltr">The parallel growth of contemporary machine learning (ML) technologies alongside edge/-fog networking has necessitated the development of novel paradigms to effectively manage their intersection. Specifically, the proliferation of edge devices equipped with data generation and ML model training capabilities has given rise to an alternative paradigm called federated learning (FL), moving away from traditional centralized ML common in cloud-based networks. FL involves training ML models directly on edge devices where data are generated.</p><p dir="ltr">A fundamental challenge of FL lies in the extensive heterogeneity inherent to edge/fog networks, which manifests in various forms such as (i) statistical heterogeneity: edge devices have distinct underlying data distributions, (ii) structural heterogeneity: edge devices have diverse physical hardware, (iii) data quality heterogeneity: edge devices have varying ratios of labeled and unlabeled data, and (iv) adversarial compromise: some edge devices may be compromised by adversarial attacks. This dissertation endeavors to capture and model these intricate relationships at the intersection of FL and highly heterogeneous edge/fog networks. To do so, this dissertation will initially develop closed-form expressions for the trade-offs between ML performance and resource cost considerations within edge/fog networks. Subsequently, it optimizes the fundamental processes of FL, encompassing aspects such as batch size control for stochastic gradient descent (SGD) and sampling for global aggregations. This optimization is jointly formulated with networking considerations, which include communication resource consumption and device-to-device (D2D) cooperation.</p><p dir="ltr">In the former half of the dissertation, the emphasis is first on optimizing device sampling for global aggregations in FL, and then on developing a self-sufficient hierarchical meta-learning approach for FL. These methodologies maximize expected ML model performance while addressing common challenges associated with statistical and system heterogeneity. Novel techniques, such as management of D2D data offloading, adaptive CPU clock cycle control, integration of meta-learning, and much more, enable these methodologies. In particular, the proposed hierarchical meta-learning approach enables rapid integration of new devices in large-scale edge/fog networks.</p><p dir="ltr">The latter half of the dissertation directs its ocus towards emerging forms of heterogeneity in FL scenarios, namely (i) heterogeneity in quantity and quality of local labeled and unlabeled data at edge devices and (ii) heterogeneity in terms of adversarially comprised edge devices. To deal with heterogeneous labeled/unlabeled data across edge networks, this dissertation proposes a novel methodology that enables multi-source to multi-target federated domain adaptation. This proposed methodology views edge devices as sources – devices with mostly labeled data that perform ML model training, or targets - devices with mostly unlabeled data that rely on sources’ ML models, and subsequently optimizes the network relationships. In the final chapter, a novel methodology to improve FL robustness is developed in part by viewing adversarial attacks on FL as a form of heterogeneity.</p>
|
87 |
Confidential Federated Learning with Homomorphic Encryption / Konfidentiellt federat lärande med homomorf krypteringWang, Zekun January 2023 (has links)
Federated Learning (FL), one variant of Machine Learning (ML) technology, has emerged as a prevalent method for multiple parties to collaboratively train ML models in a distributed manner with the help of a central server normally supplied by a Cloud Service Provider (CSP). Nevertheless, many existing vulnerabilities pose a threat to the advantages of FL and cause potential risks to data security and privacy, such as data leakage, misuse of the central server, or the threat of eavesdroppers illicitly seeking sensitive information. Promisingly advanced cryptography technologies such as Homomorphic Encryption (HE) and Confidential Computing (CC) can be utilized to enhance the security and privacy of FL. However, the development of a framework that seamlessly combines these technologies together to provide confidential FL while retaining efficiency remains an ongoing challenge. In this degree project, we develop a lightweight and user-friendly FL framework called Heflp, which integrates HE and CC to ensure data confidentiality and integrity throughout the entire FL lifecycle. Heflp supports four HE schemes to fit diverse user requirements, comprising three pre-existing schemes and one optimized scheme that we design, named Flashev2, which achieves the highest time and spatial efficiency across most scenarios. The time and memory overheads of all four HE schemes are also evaluated and a comparison between the pros and cons of each other is summarized. To validate the effectiveness, Heflp is tested on the MNIST dataset and the Threat Intelligence dataset provided by CanaryBit, and the results demonstrate that it successfully preserves data privacy without compromising model accuracy. / Federated Learning (FL), en variant av Maskininlärning (ML)-teknologi, har framträtt som en dominerande metod för flera parter att samarbeta om att distribuerat träna ML-modeller med hjälp av en central server som vanligtvis tillhandahålls av en molntjänstleverantör (CSP). Trots detta utgör många befintliga sårbarheter ett hot mot FL:s fördelar och medför potentiella risker för datasäkerhet och integritet, såsom läckage av data, missbruk av den centrala servern eller risken för avlyssnare som olagligt söker känslig information. Lovande avancerade kryptoteknologier som Homomorf Kryptering (HE) och Konfidentiell Beräkning (CC) kan användas för att förbättra säkerheten och integriteten för FL. Utvecklingen av en ramverk som sömlöst kombinerar dessa teknologier för att erbjuda konfidentiellt FL med bibehållen effektivitet är dock fortfarande en pågående utmaning. I detta examensarbete utvecklar vi en lättviktig och användarvänlig FL-ramverk som kallas Heflp, som integrerar HE och CC för att säkerställa datakonfidentialitet och integritet under hela FLlivscykeln. Heflp stöder fyra HE-scheman för att passa olika användarbehov, bestående av tre befintliga scheman och ett optimerat schema som vi designar, kallat Flashev2, som uppnår högsta tids- och rumeffektivitet i de flesta scenarier. Tids- och minneskostnaderna för alla fyra HE-scheman utvärderas också, och en jämförelse mellan fördelar och nackdelar sammanfattas. För att validera effektiviteten testas Heflp på MNIST-datasetet och Threat Intelligence-datasetet som tillhandahålls av CanaryBit, och resultaten visar att det framgångsrikt bevarar datasekretessen utan att äventyra modellens noggrannhet.
|
88 |
Dynamic GAN-based Clustering in Federated LearningKim, Yeongwoo January 2020 (has links)
As the era of Industry 4.0 arises, the number of devices that are connectedto a network has increased. The devices continuously generate data that hasvarious information from power consumption to the configuration of thedevices. Since the data have the raw information about each local node inthe network, the manipulation of the information brings a potential to benefitthe network with different methods. However, due to the large amount ofnon-IID data generated in each node, manual operations to process the dataand tune the methods became challenging. To overcome the challenge, therehave been attempts to apply automated methods to build accurate machinelearning models by a subset of collected data or cluster network nodes byleveraging clustering algorithms and using machine learning models withineach cluster. However, the conventional clustering algorithms are imperfectin a distributed and dynamic network due to risk of data privacy, the nondynamicclusters, and the fixed number of clusters. These limitations ofthe clustering algorithms degrade the performance of the machine learningmodels because the clusters may become obsolete over time. Therefore, thisthesis proposes a three-phase clustering algorithm in dynamic environmentsby leveraging 1) GAN-based clustering, 2) cluster calibration, and 3) divisiveclustering in federated learning. GAN-based clustering preserves data becauseit eliminates the necessity of sharing raw data in a network to create clusters.Cluster calibration adds dynamics to fixed clusters by continuously updatingclusters and benefits methods that manage the network. Moreover, the divisiveclustering explores the different number of clusters by iteratively selectingand dividing a cluster into multiple clusters. As a result, we create clustersfor dynamic environments and improve the performance of machine learningmodels within each cluster. / ett nätverk ökat. Enheterna genererar kontinuerligt data som har varierandeinformation, från strömförbrukning till konfigurationen av enheterna. Eftersomdatan innehåller den råa informationen om varje lokal nod i nätverket germanipulation av informationen potential att gynna nätverket med olika metoder.På grund av den stora mängden data, och dess egenskap av att vara icke-o.l.f.,som genereras i varje nod blir manuella operationer för att bearbeta data ochjustera metoderna utmanande. För att hantera utmaningen finns försök med attanvända automatiserade metoder för att bygga precisa maskininlärningsmodellermed hjälp av en mindre mängd insamlad data eller att gruppera nodergenom att utnyttja klustringsalgoritmer och använda maskininlärningsmodellerinom varje kluster. De konventionella klustringsalgoritmerna är emellertidofullkomliga i ett distribuerat och dynamiskt nätverk på grund av risken fördataskydd, de icke-dynamiska klusterna och det fasta antalet kluster. Dessabegränsningar av klustringsalgoritmerna försämrar maskininlärningsmodellernasprestanda eftersom klustren kan bli föråldrade med tiden. Därför föreslårdenna avhandling en trefasklustringsalgoritm i dynamiska miljöer genom attutnyttja 1) GAN-baserad klustring, 2) klusterkalibrering och 3) klyvning avkluster i federerad inlärning. GAN-baserade klustring bevarar dataintegriteteneftersom det eliminerar behovet av att dela rådata i ett nätverk för att skapakluster. Klusterkalibrering lägger till dynamik i klustringen genom att kontinuerligtuppdatera kluster och fördelar metoder som hanterar nätverket. Dessutomdelar den klövlande klustringen olika antal kluster genom att iterativt välja ochdela ett kluster i flera kluster. Som ett resultat skapar vi kluster för dynamiskamiljöer och förbättrar prestandan hos maskininlärningsmodeller inom varjekluster.
|
89 |
Two New Applications of Tensors to Machine Learning for Wireless CommunicationsBhogi, Keerthana 09 September 2021 (has links)
With the increasing number of wireless devices and the phenomenal amount of data that is being generated by them, there is a growing interest in the wireless communications community to complement the traditional model-driven design approaches with data-driven machine learning (ML)-based solutions. However, managing the large-scale multi-dimensional data to maintain the efficiency and scalability of the ML algorithms has obviously been a challenge. Tensors provide a useful framework to represent multi-dimensional data in an integrated manner by preserving relationships in data across different dimensions. This thesis studies two new applications of tensors to ML for wireless communications where the tensor structure of the concerned data is exploited in novel ways.
The first contribution of this thesis is a tensor learning-based low-complexity precoder codebook design technique for a full-dimension multiple-input multiple-output (FD-MIMO) system with a uniform planar antenna (UPA) array at the transmitter (Tx) whose channel distribution is available through a dataset. Represented as a tensor, the FD-MIMO channel is further decomposed using a tensor decomposition technique to obtain an optimal precoder which is a function of Kronecker-Product (KP) of two low-dimensional precoders, each corresponding to the horizontal and vertical dimensions of the FD-MIMO channel. From the design perspective, we have made contributions in deriving a criterion for optimal product precoder codebooks using the obtained low-dimensional precoders. We show that this product codebook design problem is an unsupervised clustering problem on a Cartesian Product Grassmann Manifold (CPM), where the optimal cluster centroids form the desired codebook. We further simplify this clustering problem to a $K$-means algorithm on the low-dimensional factor Grassmann manifolds (GMs) of the CPM which correspond to the horizontal and vertical dimensions of the UPA, thus significantly reducing the complexity of precoder codebook construction when compared to the existing codebook learning techniques.
The second contribution of this thesis is a tensor-based bandwidth-efficient gradient communication technique for federated learning (FL) with convolutional neural networks (CNNs). Concisely, FL is a decentralized ML approach that allows to jointly train an ML model at the server using the data generated by the distributed users coordinated by a server, by sharing only the local gradients with the server and not the raw data. Here, we focus on efficient compression and reconstruction of convolutional gradients at the users and the server, respectively. To reduce the gradient communication overhead, we compress the sparse gradients at the users to obtain their low-dimensional estimates using compressive sensing (CS)-based technique and transmit to the server for joint training of the CNN. We exploit a natural tensor structure offered by the convolutional gradients to demonstrate the correlation of a gradient element with its neighbors. We propose a novel prior for the convolutional gradients that captures the described spatial consistency along with its sparse nature in an appropriate way. We further propose a novel Bayesian reconstruction algorithm based on the Generalized Approximate Message Passing (GAMP) framework that exploits this prior information about the gradients. Through the numerical simulations, we demonstrate that the developed gradient reconstruction method improves the convergence of the CNN model. / Master of Science / The increase in the number of wireless and mobile devices have led to the generation of massive amounts of multi-modal data at the users in various real-world applications including wireless communications. This has led to an increasing interest in machine learning (ML)-based data-driven techniques for communication system design. The native setting of ML is {em centralized} where all the data is available on a single device. However, the distributed nature of the users and their data has also motivated the development of distributed ML techniques. Since the success of ML techniques is grounded in their data-based nature, there is a need to maintain the efficiency and scalability of the algorithms to manage the large-scale data. Tensors are multi-dimensional arrays that provide an integrated way of representing multi-modal data. Tensor algebra and tensor decompositions have enabled the extension of several classical ML techniques to tensors-based ML techniques in various application domains such as computer vision, data-mining, image processing, and wireless communications. Tensors-based ML techniques have shown to improve the performance of the ML models because of their ability to leverage the underlying structural information in the data.
In this thesis, we present two new applications of tensors to ML for wireless applications and show how the tensor structure of the concerned data can be exploited and incorporated in different ways. The first contribution is a tensor learning-based precoder codebook design technique for full-dimension multiple-input multiple-output (FD-MIMO) systems where we develop a scheme for designing low-complexity product precoder codebooks by identifying and leveraging a tensor representation of the FD-MIMO channel. The second contribution is a tensor-based gradient communication scheme for a decentralized ML technique known as federated learning (FL) with convolutional neural networks (CNNs), where we design a novel bandwidth-efficient gradient compression-reconstruction algorithm that leverages a tensor structure of the convolutional gradients. The numerical simulations in both applications demonstrate that exploiting the underlying tensor structure in the data provides significant gains in their respective performance criteria.
|
90 |
Federated Neural Collaborative Filtering for privacy-preserving recommender systemsLangelaar, Johannes, Strömme Mattsson, Adam January 2021 (has links)
In this thesis a number of models for recommender systems are explored, all using collaborative filtering to produce their recommendations. Extra focus is put on two models: Matrix Factorization, which is a linear model and Multi-Layer Perceptron, which is a non-linear model. With an additional purpose of training the models without collecting any sensitive data from the users, both models were implemented with a learning technique that does not require the server's knowledge of the users' data, called federated learning. The federated version of Matrix Factorization is already well-researched, and has proven not to protect the users' data at all; the data is derivable from the information that the users communicate to the server that is necessary for the learning of the model. However, on the federated Multi-Layer Perceptron model, no research could be found. In this thesis, such a model is therefore designed and presented. Arguments are put forth in support of the privacy preservability of the model, along with a proof of the user data not being analytically derivable for the central server. In addition, new ways to further put the protection of the users' data on the test are discussed. All models are evaluated on two different data sets. The first data set contains data on ratings of movies and is called MovieLens 1M. The second is a data set that consists of anonymized fund transactions, provided by the Swedish bank SEB for this thesis. Test results suggest that the federated versions of the models can achieve similar recommendation performance as their non-federated counterparts.
|
Page generated in 0.0945 seconds