21 |
Towards Peer-to-Peer Federated Learning: Algorithms and Comparisons to Centralized Federated Learning. Mäenpää, Dylan, January 2021 (has links)
Due to privacy and regulatory reasons, sharing data between institutions can be difficult. Because of this, real-world data are not fully exploited by machine learning (ML). An emerging method is to train ML models with federated learning (FL), which enables clients to collaboratively train ML models without sharing raw training data. We explored peer-to-peer FL by extending a prominent centralized FL algorithm called Fedavg to function in a peer-to-peer setting. We named this extended algorithm FedavgP2P. Deep neural networks at 100 simulated clients were trained to recognize digits using FedavgP2P and the MNIST data set. Scenarios with IID and non-IID client data were studied. We compared FedavgP2P to Fedavg with respect to the models' convergence behaviors and communication costs. Additionally, we analyzed how the amount of local client computation and the number of neighbors each client communicates with affect performance. We also attempted to improve the FedavgP2P algorithm with heuristics based on client identities and per-class F1-scores. The findings showed that with FedavgP2P, the mean model convergence behavior was comparable to that of a model trained with Fedavg. However, this came with varying degrees of variation across the 100 models' convergence behaviors and much greater communication costs (at least 14.9x more communication with FedavgP2P). Increasing the amount of local computation up to a certain level reduced communication costs, and increasing the number of neighbors a client communicated with lowered the variation in the models' convergence behaviors. The FedavgP2P heuristics did not show improved performance. In conclusion, the overall findings indicate that peer-to-peer FL is a promising approach.
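As a rough illustration of the averaging step FedavgP2P replaces the server with, here is a minimal NumPy sketch in which each client trains locally and then averages parameters with a sampled set of neighbors. The function names, the toy least-squares objective, and the uniform neighbor sampling are illustrative assumptions, not the thesis's implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

def local_update(w, X, y, lr=0.1, epochs=1):
    # Toy local training: a few gradient steps on a least-squares loss.
    for _ in range(epochs):
        grad = X.T @ (X @ w - y) / len(y)
        w = w - lr * grad
    return w

def fedavg_p2p_round(weights, datasets, neighbors, k=3):
    """One FedavgP2P-style round: each client trains locally, then
    averages its model with k sampled neighbors (no central server)."""
    n = len(weights)
    trained = [local_update(weights[i], *datasets[i]) for i in range(n)]
    new_weights = []
    for i in range(n):
        peers = rng.choice(neighbors[i], size=min(k, len(neighbors[i])), replace=False)
        group = [trained[i]] + [trained[j] for j in peers]
        new_weights.append(np.mean(group, axis=0))
    return new_weights

# Demo: 10 clients on a shared linear problem with client-specific noise.
d = 5
w_true = rng.normal(size=d)
datasets = []
for _ in range(10):
    X = rng.normal(size=(50, d))
    datasets.append((X, X @ w_true + 0.1 * rng.normal(size=50)))
neighbors = [[j for j in range(10) if j != i] for i in range(10)]
weights = [np.zeros(d) for _ in range(10)]
for _ in range(20):
    weights = fedavg_p2p_round(weights, datasets, neighbors)
```

Increasing k in this sketch corresponds to the thesis's observation that communicating with more neighbors reduces the variation across client models.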
|
22 |
Applied Machine Learning for Online Education. Serena Alexis Nicoll (12476796), 28 April 2022 (has links)
We consider the problem of developing innovative machine learning tools for online education and evaluate their ability to provide instructional resources. Predicting student behavior is a complex problem spanning a wide range of topics: we complement current research in student grade prediction and clickstream analysis by considering data from three areas of online learning: Social Learning Networks (SLN), Instructor Feedback, and Learning Management Systems (LMS). In each of these categories, we propose a novel method for modelling data and an associated tool that may be used to assist students and instructors. First, we develop a methodology for analyzing instructor-provided feedback and determining how it correlates with changes in student grades using NLP and NER-based feature extraction. We demonstrate that student grade improvement can be well approximated by a multivariate linear model, with average fits across course sections approaching 83%, and determine several contributors to student success. Additionally, we develop a series of link prediction methodologies that utilize spatial and time-evolving network architectures to pass network state across space and time. Through evaluation on six real-world datasets, we find that our method obtains substantial improvements over Bayesian models, linear classifiers, and an unsupervised baseline, with AUCs typically above 0.75 and reaching 0.99. Motivated by federated learning, we extend our model of student discussion forums to model an entire classroom as an SLN. We develop a methodology to represent student actions across different course materials in a shared, low-dimensional space that allows characteristics from actions of different types to be passed jointly to a downstream task. Performance comparisons against several baselines in centralized, federated, and personalized learning demonstrate that our model offers more distinctive representations of students in a low-dimensional space, which in turn results in improved accuracy on a common downstream prediction task. Results from these three research thrusts indicate the ability of machine learning methods to accurately model student behavior across multiple data types and suggest their potential to benefit students and instructors alike through future development of assistive tools.
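As a hedged illustration of the first thrust's modelling step, the sketch below fits a multivariate linear model from feedback-derived features to grade changes with scikit-learn. The synthetic stand-in features are assumptions; the thesis's actual NLP/NER features are not reproduced here:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(1)

# Stand-in design matrix: rows are students, columns are hypothetical
# feedback features (e.g., sentiment score, named-entity counts, length).
X = rng.normal(size=(200, 4))
true_coefs = np.array([0.8, -0.3, 0.5, 0.0])
grade_delta = X @ true_coefs + 0.2 * rng.normal(size=200)

model = LinearRegression().fit(X, grade_delta)
print("R^2:", model.score(X, grade_delta))   # analogous to the ~83% fits reported
print("feature weights:", model.coef_)       # candidate contributors to success
```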
|
23 |
Privacy-Preserved Federated Learning: A survey of applicable machine learning algorithms in a federated environment. Carlsson, Robert, January 2020 (has links)
There is potential in the fields of medicine and finance for collaborative machine learning. These areas gather data which can be used to develop machine learning models that could predict everything from sickness in patients to economic crimes such as fraud. The problem is that the collected data are mostly of a confidential nature and should be handled with precaution. This makes the standard way of doing machine learning - gathering data at one centralized server - undesirable. The safety of the data has to be taken into account. In this project we explore the federated learning approach of "bringing the code to the data, instead of the data to the code". It is a decentralized way of doing machine learning where models are trained on connected devices and data is never shared, keeping the data privacy-preserved.
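A minimal sketch of the pattern described above: the server ships the current model to each client, training happens where the data lives, and only updated parameters travel back. The toy least-squares objective and function names are illustrative assumptions, not a specific library's API:

```python
import numpy as np

def client_update(global_w, X, y, lr=0.05, steps=10):
    # Training happens where the data lives; raw (X, y) never leave the client.
    w = global_w.copy()
    for _ in range(steps):
        w -= lr * X.T @ (X @ w - y) / len(y)
    return w

def server_round(global_w, clients):
    # Weighted average of the returned models (FedAvg-style aggregation);
    # the server only ever sees parameters, never raw data.
    sizes = np.array([len(y) for _, y in clients])
    updates = np.stack([client_update(global_w, X, y) for X, y in clients])
    return (updates * (sizes / sizes.sum())[:, None]).sum(axis=0)
```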
|
24 |
Domain-based Collaborative Learning for Enhanced Health Management of Distributed Industrial Assets. Pandhare, Vibhor, January 2021 (has links)
No description available.
|
25 |
Decentralized Federated Autonomous Organizations for Prognostics and Health Management. Bagheri, Behrad, 15 June 2020 (has links)
No description available.
|
26 |
Joint Resource Management and Task Scheduling for Mobile Edge Computing. Wei, Xinliang, January 2023 (has links)
In recent years, edge computing has become an increasingly popular computing paradigm for enabling real-time data processing and mobile intelligence. Edge computing allows computing at the edge of the network, where data are generated and distributed to nearby edge servers, reducing data access latency and improving data processing efficiency. In addition, with the advance of the Artificial Intelligence of Things (AIoT), not only are millions of data points generated by everyday smart devices, such as smart light bulbs, smart cameras, and various sensors, but a large number of parameters of complex machine learning models also have to be trained and exchanged by these AIoT devices. Classical cloud-based platforms have difficulty communicating and processing these data and models effectively with sufficient privacy and security protection. Due to the heterogeneity of edge elements, including edge servers, mobile users, data resources, and computing tasks, the key challenge is how to effectively manage resources (e.g., data and services) and schedule tasks (e.g., ML/FL tasks) in the edge clouds to meet the QoS requirements of mobile users or maximize the platform's utility. To that end, this dissertation studies joint resource management and task scheduling for mobile edge computing.
The key contributions of the dissertation are two-fold. Firstly, we study the data placement problem in edge computing and propose a popularity-based method as well as several load-balancing strategies to effectively place data in the edge network. We further investigate a joint resource placement and task dispatching problem and formulate it as an optimization problem. We propose a two-stage optimization method and a reinforcement learning (RL) method to maximize the total utility of all tasks. Secondly, we focus on a specific computing task, i.e., federated learning (FL), and study the joint participant selection and learning scheduling problem for multi-model federated edge learning. We formulate a joint optimization problem and propose several multi-stage optimization algorithms to solve it. To further improve FL performance, we leverage the power of quantum computing (QC) and propose a hybrid quantum-classical Benders' decomposition (HQCBD) algorithm, as well as a multiple-cuts version, to accelerate the convergence of HQCBD. We show that the proposed algorithms achieve the same optimal value as the classical Benders' decomposition running on a classical CPU, but with fewer convergence iterations. / Computer and Information Science
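As a hedged illustration of the popularity-based placement idea (not the dissertation's formulation), the sketch below greedily places the most-requested data items first, each onto the currently least-loaded edge server with free capacity:

```python
from heapq import heappush, heappop

def popularity_placement(popularity, num_servers, capacity):
    """Greedily place data items: most popular first, each onto the
    currently least-loaded edge server (load measured in request rate)."""
    # Min-heap of (current_load, used_slots, server_id).
    heap = [(0.0, 0, s) for s in range(num_servers)]
    placement = {}
    for item, pop in sorted(popularity.items(), key=lambda kv: -kv[1]):
        # Pop servers until one with free capacity is found.
        skipped = []
        while heap:
            load, used, sid = heappop(heap)
            if used < capacity:
                placement[item] = sid
                heappush(heap, (load + pop, used + 1, sid))
                break
            skipped.append((load, used, sid))
        for entry in skipped:
            heappush(heap, entry)
    return placement

# Example: 6 data items with request rates, 3 edge servers, 2 slots each.
print(popularity_placement({"a": 9, "b": 7, "c": 5, "d": 3, "e": 2, "f": 1}, 3, 2))
```

Placing popular items first keeps the per-server request load balanced, which is the intuition behind combining popularity with load-balancing strategies.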
|
27 |
RISK INTERPRETATION OF DIFFERENTIAL PRIVACY. Jiajun Liang (13190613), 31 July 2023 (has links)
How to set privacy parameters is a crucial problem for the consistent application of differential privacy (DP) in practice. The current privacy parameters do not provide direct guidance for this problem. On the other hand, different databases may have varying degrees of information leakage, allowing attackers to enhance their attacks with the available information. This dissertation provides an additional interpretation of the current DP notions by introducing a framework that directly considers the worst-case average failure probability of attackers under different levels of knowledge.

To achieve this, we introduce a novel measure of attacker knowledge and establish a dual relationship between (type I error, type II error) and (prior, average failure probability). By leveraging this framework, we propose an interpretable paradigm to consistently set privacy parameters on different databases with varying levels of leaked information.

Furthermore, we characterize the minimax limit of private parameter estimation, driven by $1/(n(1-2p))^2+1/n$, where $p$ represents the worst-case probability risk and $n$ is the number of data points. This characterization is more interpretable than the current lower bound $\min\{1/(n\epsilon^2),1/(n\delta^2)\}+1/n$ for $(\epsilon,\delta)$-DP. Additionally, we identify the phase transition of private parameter estimation based on this limit and provide suggestions for protocol designs that achieve optimal private estimation.

Lastly, we consider a federated learning setting where the data are stored in a distributed manner and privacy-preserving interactions are required. We extend the proposed interpretation to federated learning, considering two scenarios: protecting against privacy breaches at local nodes and against privacy breaches at the center. Specifically, we consider a non-convex sparse federated parameter estimation problem and apply it to generalized linear models. We tackle two challenges in this setting. First, we address the issue of initialization arising from privacy requirements that limit the number of queries to the database. Second, we overcome heterogeneity in the distributions among local nodes to identify low-dimensional structures.
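For context, the hypothesis-testing view of DP that this style of analysis builds on can be stated as follows; this is the standard formulation (in the spirit of Wasserman-Zhou and Kairouz et al.), not necessarily the dissertation's notation:

```latex
% Any attacker testing between neighboring databases D and D' under
% (\epsilon,\delta)-DP has type I error \alpha and type II error \beta satisfying
\alpha + e^{\epsilon}\beta \ge 1 - \delta,
\qquad
\beta + e^{\epsilon}\alpha \ge 1 - \delta.
% Given a prior p on the alternative, the attacker's average failure probability
% is p\,\beta + (1-p)\,\alpha, which links (type I, type II) error pairs to
% (prior, average failure probability) pairs of the kind the framework studies.
```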
|
28 |
High Probability Guarantees for Federated Learning. Sravani Ramishetty (16679784), 28 July 2023 (has links)
Federated learning (FL) has emerged as a promising approach for training machine learning models on distributed data while ensuring privacy preservation and data locality. However, one key challenge in FL optimization is the lack of high probability guarantees, which can undermine the trustworthiness of FL solutions. To address this critical issue, we introduce the Federated Averaging with post-optimization (FedAvg-PO) method, a modification of the Federated Averaging (FedAvg) algorithm. The proposed algorithm applies a post-optimization phase to evaluate a short list of solutions generated by several independent runs of the FedAvg method. These modifications significantly improve the large-deviation properties of FedAvg, which improves the reliability and robustness of the optimization process. A novel complexity analysis shows that FedAvg-PO can compute accurate and statistically guaranteed solutions in the federated learning context. Our result further relaxes the restrictive assumptions in FL theory by developing new technical tools which may be of independent interest. The insights provided by the computational requirements analysis contribute to the understanding of the scalability and efficiency of the algorithm, guiding its practical implementation.
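The post-optimization idea can be sketched as: run FedAvg several times independently, then keep the best candidate from the resulting short list. In the minimal sketch below, the selection rule is held-out validation loss, which is an assumption; the thesis's post-optimization phase may select differently:

```python
import numpy as np

def fedavg(clients, rounds, seed):
    """Stand-in for one independent FedAvg run (toy least-squares locals,
    plain averaging at the server), returning a trained parameter vector."""
    rng = np.random.default_rng(seed)
    w = rng.normal(scale=0.1, size=clients[0][0].shape[1])
    for _ in range(rounds):
        local_models = []
        for X, y in clients:
            wi = w.copy()
            for _ in range(5):
                wi -= 0.1 * X.T @ (X @ wi - y) / len(y)
            local_models.append(wi)
        w = np.mean(local_models, axis=0)
    return w

def fedavg_po(clients, X_val, y_val, num_runs=5, rounds=30):
    # Post-optimization: keep the run whose model has the best validation
    # loss, boosting the probability that the returned solution is accurate.
    candidates = [fedavg(clients, rounds, seed) for seed in range(num_runs)]
    losses = [np.mean((X_val @ w - y_val) ** 2) for w in candidates]
    return candidates[int(np.argmin(losses))]

# Tiny demo with synthetic linear data split across 8 clients.
rng = np.random.default_rng(0)
w_star = rng.normal(size=4)

def make_client(m):
    X = rng.normal(size=(m, 4))
    return X, X @ w_star + 0.1 * rng.normal(size=m)

clients = [make_client(40) for _ in range(8)]
X_val, y_val = make_client(100)
w_best = fedavg_po(clients, X_val, y_val)
```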
|
29 |
Generalization in federated learning. Tenison, Irene, 08 1900 (has links)
Federated learning is an emerging paradigm that permits a large number of clients with heterogeneous data to coordinate learning of a unified global model without the need to share data amongst each other or with a central storage. It enhances data privacy, as data are decentralized and do not leave the client devices. Standard federated learning algorithms involve averaging of model parameters or gradient updates to approximate the global model at the server. However, in heterogeneous settings averaging can result in information loss and lead to poor generalization due to the bias induced by dominant client gradients. We hypothesize that to generalize better across non-i.i.d. datasets, the algorithms should focus on learning the invariant mechanism that is constant across clients while ignoring spurious mechanisms that differ between them.

Inspired by recent works in the Out-of-Distribution literature, we propose a gradient masked averaging approach for FL as an alternative to the standard averaging of client updates. This client update aggregation technique can be adopted as a drop-in replacement in most existing federated algorithms. We perform extensive experiments with the gradient masked approach on multiple FL algorithms with in-distribution, real-world, and out-of-distribution (as the worst-case scenario) test datasets, along with quantity imbalances, and show that it provides consistent improvements, particularly in the case of heterogeneous clients. Theoretical guarantees further support the proposed algorithm.
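A minimal sketch of the masking idea, under the assumption that the per-coordinate agreement score is the absolute mean sign of the client updates and that coordinates below a threshold are zeroed out; the thesis's exact scoring and masking rule may differ:

```python
import numpy as np

def gradient_masked_average(client_updates, tau=0.6):
    """Aggregate client updates, suppressing coordinates where clients
    disagree in sign (spurious, client-specific directions)."""
    U = np.stack(client_updates)                     # (num_clients, dim)
    agreement = np.abs(np.mean(np.sign(U), axis=0))  # 1.0 = full sign agreement
    mask = (agreement >= tau).astype(U.dtype)        # keep invariant directions
    return mask * U.mean(axis=0)

# Drop-in replacement for the plain mean in a FedAvg-style server step:
updates = [np.array([0.5, -0.2, 0.10]),
           np.array([0.4,  0.3, 0.20]),
           np.array([0.6, -0.1, 0.15])]
print(gradient_masked_average(updates))  # middle coordinate is masked out
```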
|
30 |
Efficient Decentralized Learning Methods for Deep Neural Networks. Sai Aparna Aketi (18258529), 26 March 2024 (has links)
<p dir="ltr">Decentralized learning is the key to training deep neural networks (DNNs) over large distributed datasets generated at different devices and locations, without the need for a central server. They enable next-generation applications that require DNNs to interact and learn from their environment continuously. The practical implementation of decentralized algorithms brings about its unique set of challenges. In particular, these algorithms should be (a) compatible with time-varying graph structures, (b) compute and communication efficient, and (c) resilient to heterogeneous data distributions. The objective of this thesis is to enable efficient decentralized learning in deep neural networks addressing the abovementioned challenges. Towards this, firstly a communication-efficient decentralized algorithm (Sparse-Push) that supports directed and time-varying graphs with error-compensated communication compression is proposed. Second, a low-precision decentralized training that aims to reduce memory requirements and computational complexity is proposed. Here, we design ”Range-EvoNorm” as the normalization activation layer which is better suited for low-precision decentralized training. Finally, addressing the problem of data heterogeneity, three impactful advancements namely Neighborhood Gradient Mean (NGM), Global Update Tracking (GUT), and Cross-feature Contrastive Loss (CCL) are proposed. NGM utilizes extra communication rounds to obtain cross-agent gradient information whereas GUT tracks global update information with no communication overhead, improving the performance on heterogeneous data. CCL explores an orthogonal direction of using a data-free knowledge distillation approach to handle heterogeneous data in decentralized setups. All the algorithms are evaluated on computer vision tasks using standard image-classification datasets. We conclude this dissertation by presenting a summary of the proposed decentralized methods and their trade-offs for heterogeneous data distributions. Overall, the methods proposed in this thesis address the critical limitations of training deep neural networks in a decentralized setup and advance the state-of-the-art in this domain.</p>
|