1

Performance Enhancement Schemes and Effective Incentives for Federated Learning

Wang, Yuwei 16 November 2021 (has links)
The advent of artificial intelligence applications demands massive amounts of data to supplement the training of machine learning models. Traditional machine learning schemes require central processing of large volumes of data that may contain sensitive patterns such as user location, personal information, or transaction history. Federated Learning (FL) has been proposed to complement the traditional centralized methods: multiple local models are trained and then aggregated on a central cloud server. However, the performance of FL needs to be further improved, since its accuracy is not on par with traditional centralized machine learning approaches. Furthermore, due to the possibility of privacy leakage, not enough clients are willing to participate in the FL training process. Common practice for the uploaded local models is an evenly weighted aggregation, which assumes that each node of the network contributes equally to advancing the global model and is therefore unfair to the owners of higher-contribution models. This thesis focuses on three aspects of improving the whole federated learning pipeline: client selection, reputation-enabled weight aggregation, and an incentive mechanism. For client selection, a reputation score consisting of evaluation metrics is introduced to eliminate poorly performing model contributions. This scheme enhances the original implementation by up to 10% for non-IID datasets. We also reduce the training time of the selection scheme by roughly 27.7% compared to the baseline implementation. Then, a reputation-enabled weighted aggregation of the local models for distributed learning is proposed: the contribution of a local model and its aggregation weight are evaluated and determined by its reputation score, formulated as above. A numerical comparison of the proposed methodology, which assigns different aggregation weights based on the accuracy of each model, to a baseline that uses standard average aggregation shows an accuracy improvement of 17.175% over the baseline for not independent and identically distributed (non-IID) scenarios in an FL network of 100 participants. Finally, for the incentive mechanism, participants can be rewarded based on their data quality, data quantity, reputation, and resource allocation. In this thesis, we adopt a reputation-aware reverse auction that was originally proposed to recruit dependable participants for mobile crowdsensing campaigns, and modify that incentive to adapt it to an FL setting where user utility is defined as a function of the payment assigned by the central server and the user's service cost, such as battery and processor usage. Through numerical results, we show that: 1) the proposed incentive can improve user utilities compared to the baseline approaches; 2) platform utility can be maintained close to its value under the baselines; and 3) the overall test accuracy of the aggregated global model can even improve slightly.
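The reputation-weighted aggregation idea can be illustrated with a short sketch. The function, the threshold value, and the flattened-parameter representation below are illustrative assumptions, not the thesis's implementation:

```python
# A minimal sketch of reputation-weighted aggregation, assuming each client i
# reports a flattened parameter vector w_i and a reputation score r_i derived
# from validation metrics (all names and values are illustrative).
import numpy as np

def aggregate_by_reputation(client_weights, reputations, threshold=0.5):
    """Drop low-reputation clients, then average the rest weighted by score."""
    kept = [(w, r) for w, r in zip(client_weights, reputations) if r >= threshold]
    if not kept:
        raise ValueError("no client passed the reputation threshold")
    total = sum(r for _, r in kept)
    # Weighted sum: higher-reputation models contribute more to the global model.
    return sum(w * (r / total) for w, r in kept)

# Toy usage: three clients' parameter vectors with unequal reputations.
clients = [np.array([1.0, 2.0]), np.array([3.0, 4.0]), np.array([100.0, -100.0])]
scores = [0.9, 0.8, 0.1]  # the third, poorly performing client is filtered out
print(aggregate_by_reputation(clients, scores))
```

The same score serves double duty, first as a filter for client selection and then as the aggregation weight, which is what distinguishes this scheme from plain evenly weighted averaging.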
2

Towards a Progressive E-health Application Framework

Lu, Zhirui 29 March 2022 (has links)
Recent technological advances have opened many new possibilities for health applications. Next-generation networks allow real-time monitoring, collaboration, and diagnosis. Machine Learning and Deep Learning enable modeling and understanding of complex and enormous datasets. Yet these innovations also pose new challenges to application designers and maintainers. To deliver high-standard e-health services while following regulations, Quality of Service requirements need to be fulfilled and high accuracy needs to be achieved, let alone all the security defenses needed to protect sensitive data from leaking. In this thesis, we present a collection of works towards a progressive framework for building secure, responsive, and intelligent e-health applications, focusing on three major components: Analyze, Acquire, and Authenticate. The framework is progressive in that it can be applied to various architectures, growing with a project and adapting to its needs. For newer decentralized applications that perform data analysis locally on users' devices, powerful models outperforming existing solutions can be built using Deep Learning, while Federated Learning provides further privacy guarantees against data leakage, as shown in the case of a sleep-stage prediction task using smartwatch data. For traditional centralized applications performing complex computations on the cloud or on-premise clusters, to provide Quality of Service guarantees for the data acquisition process in a sensor network, a delay estimation model based on queueing theory is proposed and verified using simulation. We also explore the novel idea of using molecular communication for authentication, named Molecular Key, enabling the incorporation of environmental information into security policy. We envision that this framework can provide stepping stones for future e-health applications.
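The abstract does not specify which queueing model the thesis uses; as a minimal illustration, assuming a single M/M/1 queue for sensor readings, the mean time a reading spends in the system is W = 1/(mu - lambda):

```python
# Illustrative delay estimate for a data-acquisition queue, assuming an M/M/1
# model (the thesis's actual model is not specified in the abstract).
def mm1_mean_delay(arrival_rate, service_rate):
    """Mean time a sensor reading spends in the system: W = 1 / (mu - lambda)."""
    if arrival_rate >= service_rate:
        raise ValueError("queue is unstable: arrival rate must be below service rate")
    return 1.0 / (service_rate - arrival_rate)

# E.g., 80 readings/s arriving at a node that processes 100 readings/s:
print(f"mean delay: {mm1_mean_delay(80, 100) * 1000:.1f} ms")  # 50.0 ms
```

Estimates like this let a designer check a Quality of Service delay budget before deploying the acquisition pipeline.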
3

On Seven Fundamental Optimization Challenges in Machine Learning

Mishchenko, Konstantin 14 October 2021 (has links)
Many recent successes of machine learning went hand in hand with advances in optimization. The exchange of ideas between these fields has worked both ways, with machine learning building on standard optimization procedures such as gradient descent, as well as new directions in optimization theory stemming from machine learning applications. In this thesis, we discuss new developments in optimization inspired by the needs and practice of machine learning, federated learning, and data science. In particular, we consider seven key challenges of mathematical optimization that are relevant to modern machine learning applications, and develop a solution to each. Our first contribution is the resolution of a key open problem in Federated Learning: we establish the first theoretical guarantees for the famous Local SGD algorithm in the crucially important heterogeneous data regime. As the second challenge, we close the gap between the upper and lower bounds in the theory of two incremental algorithms known as Random Reshuffling (RR) and Shuffle-Once, which are widely used in practice and in fact set as the default data selection strategies for SGD in modern machine learning software. Our third contribution can be seen as a combination of our new theory for proximal RR and Local SGD, yielding a new algorithm that we call FedRR. Unlike Local SGD, FedRR is the first local first-order method that can provably beat gradient descent in communication complexity in the heterogeneous data regime. The fourth challenge is related to the class of adaptive methods. In particular, we present the first parameter-free stepsize rule for gradient descent that provably works for any locally smooth convex objective. The fifth challenge, which we resolve in the affirmative, is the development of an algorithm for distributed optimization with quantized updates that preserves the global linear convergence of gradient descent. Finally, in our sixth and seventh challenges, we develop new variance reduction (VR) mechanisms applicable to the non-smooth setting, based on proximal operators and matrix splitting. In all cases, our theory is simpler and tighter and uses fewer assumptions than the prior literature. We accompany each chapter with numerical experiments to show the tightness of the proposed theoretical results.
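The difference between the two incremental strategies is easy to state in code. A toy sketch on a quadratic objective, with data, step size, and epoch count chosen purely for illustration rather than taken from the thesis:

```python
# Random Reshuffling (RR) vs. Shuffle-Once on component losses 0.5*||x - a_i||^2.
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(size=(100, 2))   # row a_i defines one component loss
x, lr = np.zeros(2), 0.05

for epoch in range(20):
    perm = rng.permutation(len(data))  # RR: a fresh permutation every epoch.
    # Shuffle-Once would instead draw this permutation once, before the loop.
    for i in perm:
        x -= lr * (x - data[i])        # gradient of 0.5*||x - a_i||^2 is x - a_i
print(x, data.mean(axis=0))            # x approaches the minimizer, the row mean
```

Both strategies sample without replacement within an epoch, which is precisely what separates them from vanilla SGD and what the matching upper and lower bounds in the thesis characterize.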
4

Towards a Resource Efficient Framework for Distributed Deep Learning Applications

Han, Jingoo 24 August 2022 (has links)
Distributed deep learning has achieved tremendous success in solving scientific problems in research and discovery over the past years. Deep learning training is quite challenging because it requires training on large-scale, massive datasets, especially with graphics processing units (GPUs) in the latest high-performance computing (HPC) supercomputing systems. HPC architectures exhibit different training-throughput trends from the systems covered by existing studies. Multiple GPUs and high-speed interconnects are used for distributed deep learning on HPC systems. Extant distributed deep learning systems are designed for non-HPC systems without considering efficiency, leading to under-utilization of expensive HPC hardware. In addition, increasing resource heterogeneity has a negative effect on resource efficiency in distributed deep learning methods, including federated learning. Thus, it is important to address the increasing demand for both high performance and high resource efficiency in distributed deep learning systems, including the latest HPC systems and federated learning systems. In this dissertation, we explore and design novel methods and frameworks to improve the resource efficiency of distributed deep learning training. We address the following five important topics: performance analysis of deep learning on supercomputers, GPU-aware deep learning job scheduling, topology-aware virtual GPU training, heterogeneity-aware adaptive scheduling, and a token-based incentive algorithm. In the first part (Chapter 3), we focus on analyzing the performance trends of distributed deep learning on the latest HPC systems, such as the Summitdev supercomputer at Oak Ridge National Laboratory. We provide insights by conducting a comprehensive performance study of how deep learning workloads affect the performance of HPC systems with large-scale parallel processing capabilities. In the second part (Chapter 4), we design and develop MARBLE, a novel deep learning job scheduler that accounts for the non-linear scalability of GPUs within a single node and improves GPU utilization by sharing GPUs among multiple deep learning training workloads. The third part of this dissertation (Chapter 5) proposes TOPAZ, a topology-aware virtual GPU training system specifically designed for distributed deep learning on recent HPC systems. In the fourth part (Chapter 6), we explore a holistic federated learning scheduler that employs a heterogeneity-aware adaptive selection method to improve resource efficiency and accuracy, coupled with resource usage profiling and accuracy monitoring to achieve multiple goals. In the fifth part of this dissertation (Chapter 7), we focus on how to provide incentives to participants according to their contribution to the performance of the final federated model, with tokens used as a means of paying for the services of the participants and the training infrastructure. / Doctor of Philosophy / Distributed deep learning is widely used for solving critical scientific problems with massive datasets. However, to accelerate scientific discovery, resource efficiency is also important for deployment on real-world systems, such as high-performance computing (HPC) systems. Deployment of existing deep learning applications on these distributed systems may lead to underutilization of HPC hardware resources. In addition, extreme resource heterogeneity has negative effects on distributed deep learning training. However, much of the prior work has not focused on the specific challenges of distributed deep learning on HPC systems and heterogeneous federated systems in terms of optimizing resource utilization. This dissertation addresses the challenges of improving the resource efficiency of distributed deep learning applications through performance analysis of deep learning on supercomputers, GPU-aware deep learning job scheduling, topology-aware virtual GPU training, and heterogeneity-aware adaptive federated learning scheduling and incentive algorithms.
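A heterogeneity-aware selection step of the kind Chapter 6 describes might look like the following sketch; the profiling fields, the deadline, and the scoring rule are illustrative assumptions rather than the dissertation's actual scheduler:

```python
# A minimal sketch of heterogeneity-aware client selection, assuming profiled
# per-round compute and communication times (all fields and values invented
# for illustration).
def select_clients(profiles, deadline, k):
    """Pick up to k clients whose estimated round time fits the deadline."""
    feasible = [p for p in profiles if p["compute_s"] + p["comm_s"] <= deadline]
    # Among feasible clients, prefer those that recently improved accuracy most.
    feasible.sort(key=lambda p: p["accuracy_gain"], reverse=True)
    return [p["id"] for p in feasible[:k]]

profiles = [
    {"id": "gpu-node", "compute_s": 3.0, "comm_s": 1.0, "accuracy_gain": 0.02},
    {"id": "laptop",   "compute_s": 9.0, "comm_s": 2.0, "accuracy_gain": 0.05},
    {"id": "phone",    "compute_s": 25.0, "comm_s": 5.0, "accuracy_gain": 0.04},
]
print(select_clients(profiles, deadline=12.0, k=2))  # ['laptop', 'gpu-node']
```

Combining a feasibility constraint (resource profiling) with an accuracy signal (monitoring) is the two-goal structure the abstract describes.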
5

Privacy-aware Federated Learning with Global Differential Privacy

Airody Suresh, Spoorthi 31 January 2023 (has links)
There is an increasing need for low-power neural systems as neural networks become more widely used in embedded devices with limited resources. Spiking neural networks (SNNs) are proving to be a more energy-efficient alternative to conventional artificial neural networks (ANNs), which are recognized for being computationally heavy. Despite their significance, not enough attention has been paid to training SNNs with large-scale distributed machine learning techniques such as Federated Learning (FL). As federated learning involves many energy-constrained devices, there is a significant opportunity to take advantage of the energy efficiency offered by SNNs. However, it is necessary to address the real-world communication constraints in an FL system, and this is done with the help of three communication-reduction techniques, namely model compression, partial device participation, and periodic aggregation. Furthermore, the convergence of federated learning systems is also affected by data heterogeneity. Federated learning systems are capable of protecting the private data of clients from adversaries; however, by analyzing the uploaded client parameters, confidential information can still be revealed. To combat privacy attacks on FL systems, various attempts have been made to incorporate differential privacy within the framework. In this thesis, we investigate the trade-offs between communication costs and training variance in a Federated Learning system with Differential Privacy applied at the parameter server (the curator model). / Master of Science / Federated Learning is a decentralized method of training neural network models; it employs several participating devices to independently learn a model on their local data partitions. These local models are then aggregated at a central server to achieve the same performance as if the model had been trained centrally. Federated Learning systems, however, incur a communication overhead, and various communication-reduction techniques can be used to lower these costs. Spiking Neural Networks, being an energy-efficient alternative to Artificial Neural Networks, are well suited to Federated Learning systems, which consist of networks of energy-constrained devices. Federated learning systems help preserve the privacy of data in the system; however, an attacker can still obtain meaningful information from the parameters transmitted during a session. To this end, differential privacy techniques are utilized to combat privacy concerns in Federated Learning systems. In this thesis, we compare and contrast the communication costs and parameters of a federated learning system with differential privacy applied to it.
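In the curator model referenced above, differential privacy is enforced at the parameter server. A minimal sketch, assuming per-client clipping followed by the Gaussian mechanism; the clip norm and noise multiplier are illustrative, not the thesis's settings:

```python
# Global (curator-model) differential privacy: the server clips each client's
# update and adds Gaussian noise to the aggregate before averaging.
import numpy as np

def private_aggregate(updates, clip_norm=1.0, noise_multiplier=1.1, rng=None):
    rng = rng or np.random.default_rng()
    # Clip each update so no single client can dominate the aggregate.
    clipped = [u * min(1.0, clip_norm / max(np.linalg.norm(u), 1e-12))
               for u in updates]
    total = np.sum(clipped, axis=0)
    # Gaussian mechanism: noise scale is proportional to the sensitivity bound.
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=total.shape)
    return (total + noise) / len(updates)

updates = [np.random.default_rng(i).normal(size=4) for i in range(5)]
print(private_aggregate(updates, rng=np.random.default_rng(42)))
```

The added noise is exactly the source of the training variance whose trade-off against communication cost the thesis studies.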
6

A Resource-Aware Federated Learning Simulation Platform

Leandro, Fellipe 07 1900 (has links)
Increasing concerns regarding users' data privacy render distributed Machine Learning applications, which are usually data-hungry, infeasible. Federated Learning has emerged as a privacy-preserving distributed machine learning paradigm, in which each client's dataset is kept locally and only the local model parameters are transmitted to the central server. However, adopting the Federated Learning paradigm introduces new edge computing challenges, since it assumes that computationally intensive tasks can be executed locally by each device. The diverse hardware resources in a population of edge devices (e.g., smartphone models) can negatively impact the performance of Federated Learning at both the global and local levels. This thesis contributes to this context with the implementation of a hardware-aware Federated Learning platform, which provides comprehensive support for studying the impact of hardware heterogeneity on Federated Learning performance metrics by modeling the computation and communication costs associated with training tasks.
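A cost model of the kind such a platform relies on might look like the following sketch; the linear compute and communication models and every constant in them are illustrative assumptions, not the platform's calibration:

```python
# Per-round latency for one simulated device: local training time plus model
# upload time, both modeled with simple linear costs.
def round_latency(flops_per_epoch, device_flops, epochs, model_bytes, uplink_bps):
    compute_s = epochs * flops_per_epoch / device_flops   # training cost
    comm_s = 8 * model_bytes / uplink_bps                 # upload cost (bits)
    return compute_s + comm_s

# A flagship phone vs. a budget phone training the same 10 MB model:
fast = round_latency(2e11, 1e12, 5, 10e6, 50e6)
slow = round_latency(2e11, 1e11, 5, 10e6, 5e6)
print(f"fast device: {fast:.1f} s, slow device: {slow:.1f} s")  # 2.6 s vs 26.0 s
```

The ten-fold gap between the two simulated devices is precisely the straggler effect that makes hardware heterogeneity hurt global round times.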
7

A Comparative Study on Aggregation Schemes in Heterogeneous Federated Learning Scenarios

Bakambekova, Adilya 03 1900 (has links)
The rapid development of Machine Learning algorithms, the growing range of their applications, and the increasing number of Edge Computing devices have created the need for a new paradigm that benefits from both fields. Federated Learning, which emerged as an answer to this need, is a technique that also solves the privacy issues that arise when large amounts of information collected on many individual devices are used to train a Machine Learning model, by sending only the local updates and keeping the data on the device. At the same time, Federated Learning relies heavily on the computation and communication capabilities of the devices, which calculate the updates and send them to the main server to be integrated into a global model using one Aggregation Scheme or another, one of the most important aspects of Federated Learning. Carefully choosing how to aggregate local updates can mitigate the impact of the huge variety of devices. Therefore, this thesis presents a thorough investigation of Aggregation Schemes and analyzes their behavior in heterogeneous Federated Learning scenarios. It provides an extensive description of the main features of the schemes studied, defines the evaluation criteria, presents the computation and communication resource costs incurred on the devices, and provides a fair assessment.
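The most common starting point for such a comparison is FedAvg, which weights each local update by its client's sample count. A minimal sketch with illustrative values, not the thesis's experimental setup:

```python
# FedAvg-style aggregation: each client's parameters are weighted by the
# number of local training samples, so data-rich clients count for more.
import numpy as np

def fedavg(client_params, sample_counts):
    total = sum(sample_counts)
    return sum(p * (n / total) for p, n in zip(client_params, sample_counts))

params = [np.array([0.2, 1.0]), np.array([0.4, 2.0]), np.array([0.9, 3.0])]
counts = [100, 300, 600]  # heterogeneous data volumes across devices
print(fedavg(params, counts))  # larger datasets pull the global model harder
```

Other aggregation schemes vary exactly this weighting rule, which is why the choice matters so much under device and data heterogeneity.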
8

Intelligent Device Selection in Federated Edge Learning with Energy Efficiency

Peng, Cheng 12 1900 (has links)
Indiana University-Purdue University Indianapolis (IUPUI) / Due to the increasing demand from mobile devices for real-time responses from cloud computing services, federated edge learning (FEL) has emerged as a new computing paradigm that utilizes edge devices to achieve efficient machine learning while protecting their data privacy. Implementing efficient FEL suffers from the challenges of devices' limited computing and communication resources, as well as unevenly distributed datasets, which has inspired several existing studies focusing on device selection to optimize time consumption and data diversity. However, these studies fail to consider the energy consumption of edge devices given their limited power supply, which can seriously affect the cost-efficiency of FEL through unexpected device dropouts. To fill this gap, we propose a device selection model capturing both energy consumption and data diversity optimization, under constraints on time consumption and the amount of training data. We then solve the optimization problem by reformulating the original model and designing a novel algorithm, named E2DS, which greatly reduces the time complexity. By comparing with two classical FEL schemes, we validate the superiority of our proposed device selection mechanism with extensive experimental results. Furthermore, on real FEL devices, multiple tasks occupy the CPU at the same time, so the CPU frequency available for training fluctuates constantly, which may lead to large errors in computed energy consumption. To solve this problem, we deploy reinforcement learning to learn the frequency so as to approach its real value. In addition, rather than increasing data diversity, we consider a more direct way to improve the convergence speed, using loss values. We then formulate an optimization problem that minimizes energy consumption and maximizes loss values to select the appropriate set of devices. After reformulating the problem, we design a new algorithm, FCE2DS, as the solution, achieving better convergence speed and accuracy. Finally, we compare the performance of the proposed scheme with the previous scheme and the traditional scheme to verify its improvement in multiple aspects.
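The energy side of such a selection model is commonly captured by the dynamic-power approximation E = kappa * c * f^2, where c is the number of CPU cycles required and f the frequency. A short illustration, with all constants assumed for the example rather than taken from the thesis:

```python
# The usual CMOS dynamic-power training-energy model: per-cycle energy grows
# with the square of the CPU frequency, while training time shrinks linearly.
def training_energy(kappa, cycles, freq_hz):
    """E = kappa * cycles * f^2."""
    return kappa * cycles * freq_hz ** 2

def training_time(cycles, freq_hz):
    return cycles / freq_hz

# Halving the frequency quarters the energy but doubles the training time:
for f in (2.0e9, 1.0e9):
    e = training_energy(1e-28, 1e12, f)
    t = training_time(1e12, f)
    print(f"f={f / 1e9:.0f} GHz: energy={e:.0f} J, time={t:.0f} s")
```

This energy-versus-time tension is why the model must jointly constrain time consumption while minimizing energy, and why a fluctuating real-world frequency (the quantity the RL agent estimates) can throw the energy estimate off so badly.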
9

Detecting Distracted Drivers using a Federated Computer Vision Model: With the Help of Federated Learning

Viggesjöö, Joel January 2023 (has links)
One of the most common driving distractions is performing activities that divert your attention away from the road, such as using a phone for texting. To address this issue, techniques such as machine learning and computer vision can be used to identify and notify distracted drivers. A solution for this was presented in an earlier article, using a traditional centralized machine learning approach with good prediction accuracy. As a next step, that article suggested that the computer vision algorithms it introduced could be extended to a federated learning setting to further increase the robustness of the model. This project therefore extended the centralized machine learning model to a federated learning setting, with the aim of preserving the accuracy. Additionally, the project explored quantization techniques to achieve a smaller model while keeping the prediction accuracy. Furthermore, the project also explored whether data reconstruction methods could be used to further increase privacy for user data while preserving prediction accuracy. The project successfully extended the implementation to a federated learning setting and implemented the quantization techniques for size reduction, but the data reconstruction solution was never implemented due to time constraints. The project used a mixture of Python frameworks to extend the solution to a federated learning setting and to reduce the size of the model, resulting in one decentralized model and three models with sizes reduced by 48%, 70%, and 71% compared to the decentralized model. All of these models achieved prediction accuracy similar to the original centralized model, indicating that the project was a success.
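Model-size reduction of this kind can be approximated with PyTorch's post-training dynamic quantization; the toy network below is an illustrative stand-in for the project's actual computer-vision model, and this is not necessarily the specific quantization method the project used:

```python
# Post-training dynamic quantization: Linear weights are stored as int8 and
# activations are quantized on the fly during inference.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(512, 128), nn.ReLU(), nn.Linear(128, 10))

quantized = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

x = torch.randn(1, 512)
print(quantized(x).shape)  # torch.Size([1, 10]), at a fraction of the model size
```

Int8 weights take a quarter of the space of float32, which is in the same ballpark as the reported size reductions; in a federated setting a smaller model also shrinks every round's upload.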
10

Reinforcement Learning assisted Adaptive difficulty of Proof of Work (PoW) in Blockchain-enabled Federated Learning

Sethi, Prateek 10 August 2023 (has links)
This work addresses the challenge of heterogeneity in blockchain mining, particularly in the context of consortium and private blockchains. The motivation stems from the need to ensure fairness and efficiency in blockchain technology's Proof of Work (PoW) consensus mechanism. Existing consensus algorithms, such as PoW, PoS, and PoB, have succeeded in public blockchains but face challenges with heterogeneous miners. This thesis highlights the significance of considering miners' computing power and resources in PoW consensus mechanisms to enhance efficiency and fairness. It explores the implications of heterogeneity in blockchain mining for various applications, such as Federated Learning (FL), which aims to train machine learning models collaboratively across distributed devices. The research objectives of this work involve developing novel RL-based techniques to address the heterogeneity problem in consortium blockchains. Two proposed RL-based approaches, RL-based Miner Selection (RL-MS) and RL-based Miner and Difficulty Selection (RL-MDS), focus on selecting miners and dynamically adapting the difficulty of PoW based on the computing power of the chosen miners. The contributions of this research include the proposed RL-based techniques, modifications to the Ethereum code for dynamic adaptation of the Proof of Work difficulty (PoW-D), integration of the Commonwealth Cyber Initiative (CCI) xG testbed with an AI/ML framework, implementation of a simulator for experimentation, and an evaluation of different RL algorithms. The research also includes additional contributions in Open Radio Access Network (O-RAN) and smart cities. The proposed research has significant implications for achieving fairness and efficiency in blockchain mining in consortium and private blockchains. By leveraging reinforcement learning techniques and considering the heterogeneity of miners, this work contributes to improving the consensus mechanisms and performance of blockchain-based systems. / Master of Science / Technological advancement has led to devices with powerful yet heterogeneous computational resources. Due to the heterogeneity in the compute capability of miner nodes in a blockchain, the PoW consensus mechanism is unfair: more powerful devices have a higher chance of mining a block and gaining from the mining process. Additionally, PoW consensus introduces delay due to the time to mine and the block propagation time. This work uses Reinforcement Learning to solve the challenge of heterogeneity in a private Ethereum blockchain. It also introduces a time constraint to ensure efficient blockchain performance for time-critical applications.
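The core idea behind adapting PoW difficulty to miner capability can be sketched directly: the expected solve time is roughly difficulty divided by hash rate, so setting difficulty proportional to a miner's hash rate equalizes expected solve times. The sketch below replaces the thesis's learned RL policy with this direct rule, and all numbers are illustrative:

```python
# Equalizing expected PoW solve time across heterogeneous miners by scaling
# each miner's difficulty to its hash rate (a hand-written stand-in for the
# RL-MDS policy described in the thesis).
def per_miner_difficulty(hash_rate_hps, target_seconds):
    """Expected solve time ~ difficulty / hash_rate, so set
    difficulty = hash_rate * target to make it the same for every miner."""
    return hash_rate_hps * target_seconds

miners = {"raspberry-pi": 1e5, "laptop": 1e7, "server": 1e9}  # hashes/second
for name, rate in miners.items():
    d = per_miner_difficulty(rate, target_seconds=10)
    print(f"{name}: difficulty {d:.0e}, expected solve time ~10 s")
```

An RL agent earns its keep over this static rule when hash rates fluctuate or are not directly observable, which is the situation the thesis targets.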
