  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
41

High Probability Guarantees for Federated Learning

Sravani Ramishetty (16679784) 28 July 2023
Federated learning (FL) has emerged as a promising approach for training machine learning models on distributed data while preserving privacy and data locality. However, one key challenge in FL optimization is the lack of high-probability guarantees, which can undermine the trustworthiness of FL solutions. To address this issue, we introduce the Federated Averaging with Post-Optimization (FedAvg-PO) method, a modification of the Federated Averaging (FedAvg) algorithm. The proposed algorithm applies a post-optimization phase to evaluate a short list of solutions generated by several independent runs of FedAvg. This modification significantly improves the large-deviation properties of FedAvg, which in turn improves the reliability and robustness of the optimization process. The complexity analysis shows that FedAvg-PO can compute accurate and statistically guaranteed solutions in the federated learning context. Our result further relaxes the restrictive assumptions in FL theory by developing new technical tools that may be of independent interest. The analysis of computational requirements also clarifies the scalability and efficiency of the algorithm and guides its practical implementation.
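The two-phase idea in this abstract (several independent FedAvg runs, then a post-optimization phase that evaluates the short list) can be illustrated with a toy sketch. The abstract does not say how candidates are scored, so choosing the candidate with the lowest loss on a held-out sample is an assumption here, as are the 1-D least-squares problem and every function name and hyperparameter.

```python
import random

def fedavg_run(client_data, rounds, local_steps, lr, rng):
    """One FedAvg run on a toy 1-D least-squares problem; returns the final model."""
    w = rng.uniform(-5.0, 5.0)                   # random initial model
    for _ in range(rounds):
        local_models = []
        for data in client_data:
            w_k = w                              # each client starts from the global model
            for _ in range(local_steps):
                x = rng.choice(data)             # one stochastic sample per local step
                w_k -= lr * 2.0 * (w_k - x)      # gradient of (w - x)^2
            local_models.append(w_k)
        # Clients hold equally many samples here, so a plain mean is used.
        w = sum(local_models) / len(local_models)
    return w

def empirical_loss(w, samples):
    """Mean squared error of the scalar model w on a list of samples."""
    return sum((w - x) ** 2 for x in samples) / len(samples)

def fedavg_po(client_data, validation, n_runs=5, seed=0):
    """Hypothetical post-optimization: keep the run with the lowest validation loss."""
    candidates = [
        fedavg_run(client_data, rounds=20, local_steps=5, lr=0.05,
                   rng=random.Random(seed + i))
        for i in range(n_runs)
    ]
    best = min(candidates, key=lambda w: empirical_loss(w, validation))
    return best, candidates

data_rng = random.Random(42)
clients = [[data_rng.gauss(mu, 1.0) for _ in range(50)] for mu in (0.5, 1.0, 1.5)]
validation = [x for data in clients for x in data[:10]]
best, candidates = fedavg_po(clients, validation)
print(best, [round(c, 3) for c in candidates])
```

Selecting among independent runs is a common way to turn an in-expectation guarantee into a high-probability one, since a single unlucky run no longer decides the output. Whether the thesis uses this exact selection rule is not stated in the abstract.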
42

Generalization in federated learning

Tenison, Irene 08 1900
Federated learning is an emerging paradigm that permits a large number of clients with heterogeneous data to coordinate the learning of a unified global model without sharing data with each other or with central storage. It enhances data privacy because data is decentralized and does not leave the client devices. Standard federated learning algorithms average model parameters or gradient updates to approximate the global model at the server. However, in heterogeneous settings averaging can result in information loss and lead to poor generalization due to the bias induced by dominant client gradients. We hypothesize that, to generalize better across non-i.i.d. datasets, algorithms should focus on learning the invariant mechanism that is constant across clients while ignoring spurious mechanisms that differ across clients. Inspired by recent work in the out-of-distribution literature, we propose a gradient-masked averaging approach for FL as an alternative to the standard averaging of client updates. This aggregation technique can be adopted as a drop-in replacement in most existing federated algorithms. We perform extensive experiments with the gradient-masked approach on multiple FL algorithms with in-distribution, real-world, and out-of-distribution (as the worst-case scenario) test datasets, along with quantity imbalances, and show that it provides consistent improvements, particularly in the case of heterogeneous clients. Theoretical guarantees further support the proposed algorithm.
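As a rough illustration of masking client updates before averaging, the sketch below zeroes out coordinates on which clients disagree in sign. The abstract does not specify the masking rule, so the sign-agreement score, the hard threshold `tau`, and the function names are all assumptions made for this example.

```python
def sign(x):
    """Return -1, 0 or 1 according to the sign of x."""
    return (x > 0) - (x < 0)

def masked_average(client_updates, tau=0.6):
    """Average client updates, keeping only coordinates with sign agreement >= tau.

    The agreement score of a coordinate is |sum of signs| / number of clients,
    so 1.0 means every client pushes that coordinate in the same direction.
    """
    n_clients = len(client_updates)
    dim = len(client_updates[0])
    result = []
    for j in range(dim):
        column = [update[j] for update in client_updates]
        mean = sum(column) / n_clients
        agreement = abs(sum(sign(v) for v in column)) / n_clients
        result.append(mean if agreement >= tau else 0.0)  # hard mask (assumed)
    return result

updates = [
    [0.2, -0.5, 0.1],
    [0.4,  0.6, 0.3],
    [0.3, -0.4, 0.2],
]
print(masked_average(updates))
```

In this example the middle coordinate is dropped because one client disagrees with the other two, while the first and last coordinates are averaged as usual. A softer mask that scales disputed coordinates instead of zeroing them would be an equally plausible reading.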
43

Efficient Decentralized Learning Methods for Deep Neural Networks

Sai Aparna Aketi (18258529) 26 March 2024
Decentralized learning is the key to training deep neural networks (DNNs) over large distributed datasets generated at different devices and locations, without the need for a central server. It enables next-generation applications that require DNNs to interact with and learn from their environment continuously. The practical implementation of decentralized algorithms brings its own set of challenges. In particular, these algorithms should be (a) compatible with time-varying graph structures, (b) compute- and communication-efficient, and (c) resilient to heterogeneous data distributions. The objective of this thesis is to enable efficient decentralized learning in deep neural networks while addressing these challenges. To this end, we first propose a communication-efficient decentralized algorithm (Sparse-Push) that supports directed and time-varying graphs with error-compensated communication compression. Second, we propose a low-precision decentralized training scheme that aims to reduce memory requirements and computational complexity. Here, we design "Range-EvoNorm" as the normalization-activation layer, which is better suited to low-precision decentralized training. Finally, to address data heterogeneity, we propose three methods: Neighborhood Gradient Mean (NGM), Global Update Tracking (GUT), and Cross-feature Contrastive Loss (CCL). NGM uses extra communication rounds to obtain cross-agent gradient information, whereas GUT tracks global update information with no communication overhead, improving performance on heterogeneous data. CCL explores an orthogonal direction, using a data-free knowledge distillation approach to handle heterogeneous data in decentralized setups. All the algorithms are evaluated on computer vision tasks using standard image-classification datasets.
We conclude this dissertation by summarizing the proposed decentralized methods and their trade-offs for heterogeneous data distributions. Overall, the methods proposed in this thesis address critical limitations of training deep neural networks in a decentralized setup and advance the state of the art in this domain.
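Error-compensated communication compression, mentioned above in connection with Sparse-Push, can be sketched generically as top-k sparsification with a residual carried to the next round. The abstract does not describe the compressor Sparse-Push actually uses, so top-k, the class name, and the example values are assumptions; the directed-graph (push-sum style) communication is left out entirely.

```python
def top_k_sparsify(vector, k):
    """Keep the k largest-magnitude entries of vector and zero out the rest."""
    order = sorted(range(len(vector)), key=lambda i: abs(vector[i]), reverse=True)
    keep = set(order[:k])
    return [v if i in keep else 0.0 for i, v in enumerate(vector)]

class ErrorCompensatedCompressor:
    """Hypothetical helper: adds back whatever earlier rounds failed to transmit."""

    def __init__(self, dim, k):
        self.k = k
        self.residual = [0.0] * dim              # accumulated compression error

    def compress(self, update):
        corrected = [u + r for u, r in zip(update, self.residual)]
        message = top_k_sparsify(corrected, self.k)
        # Whatever was not sent is remembered and retried in later rounds.
        self.residual = [c - m for c, m in zip(corrected, message)]
        return message

comp = ErrorCompensatedCompressor(dim=4, k=1)
first = comp.compress([0.5, 0.4, 0.1, 0.0])     # sends only the 0.5 entry
second = comp.compress([0.0, 0.4, 0.1, 0.0])    # the deferred 0.4 is now sent as 0.8
print(first, second, comp.residual)
```

The second message shows the point of error compensation: the entry that was dropped in the first round is not lost, it is transmitted once its accumulated value becomes the largest.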
44

Federated Learning for Brain Tumor Segmentation

Evaldsson, Benjamin January 2024
This thesis investigates the potential of federated learning (FL) in medical image analysis, addressing the challenges posed by data privacy regulations in accessing medical datasets. The motivation stems from the increasing interest in artificial intelligence (AI) research, particularly in medical imaging for tumor detection using magnetic resonance imaging (MRI) and computed tomography (CT) scans. However, data accessibility remains a significant hurdle due to privacy regulations such as the General Data Protection Regulation (GDPR). FL emerges as a solution by sharing network parameters instead of raw medical data, thus ensuring patient confidentiality. The aims of the study are to understand the requirements for FL models to perform comparably to centrally trained models, explore the impact of different aggregation functions, assess dataset heterogeneity, and evaluate the generalization of FL models. To achieve these goals, this thesis uses the BraTS 2021 dataset, which contains 1251 cases of brain tumor volumes from 23 distinct sites, with different distributions of the data across 3-8 nodes in a federation. The federation is set up to perform brain tumor segmentation, using different aggregation functions (FedAvg, FedOpt, and FedProx) to finalize a global model. The final FL models demonstrate performance similar to that of centralized and local models, with minor variations. However, the performance of FL models varies depending on the dataset distribution and aggregation method used. Additionally, this study explores the impact of privacy-preserving techniques, such as differential privacy (DP), on FL model performance. While DP methods generally result in lower performance than non-DP methods, their effectiveness varies across data distributions and aggregation functions.
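Of the aggregation functions named above, FedAvg is the simplest: the server takes an average of the client parameters, weighted by how many training samples each client holds. The following is a minimal sketch with flat parameter lists; the three-node example values are invented for illustration and are not taken from the thesis.

```python
def fedavg_aggregate(client_params, client_sizes):
    """Average parameter vectors, weighting each client by its sample count."""
    total = sum(client_sizes)
    dim = len(client_params[0])
    return [
        sum(params[j] * n for params, n in zip(client_params, client_sizes)) / total
        for j in range(dim)
    ]

# Three hypothetical nodes with unequal dataset sizes.
params = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
sizes = [100, 300, 600]
global_params = fedavg_aggregate(params, sizes)
print(global_params)  # [4.0, 5.0]
```

Only the parameter vectors and sample counts reach the server, which is the property the abstract relies on when it says raw medical data is never shared.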
45

Over-the-Air Computation for Machine Learning: Model Aggregation via Retransmissions

Hellström, Henrik January 2022
With the emerging Internet of Things (IoT) paradigm, more than a billion sensing devices will be collecting an unprecedented amount of data. Simultaneously, the field of data analytics is being revolutionized by modern machine learning (ML) techniques that enable sophisticated processing of massive datasets. Many researchers envision a combination of these two technologies to support applications such as environmental monitoring, Industry 4.0, and vehicular communications. However, traditional wireless communication protocols are inefficient in supporting distributed ML services, where data and computations are distributed over wireless networks. This motivates the need for new wireless communication methods. One such method, over-the-air computation (AirComp), promises massive gains in energy, latency, and spectrum efficiency compared to traditional methods. The expected efficiency of AirComp is due to the complete spectrum sharing among all participating devices. Unlike in traditional physical-layer communications, where interference is avoided by allocating orthogonal communication channels, AirComp promotes interference and exploits it to compute a function of the individually transmitted messages. However, AirComp cannot reconstruct functions perfectly but introduces errors in the process, which harms the convergence rate and region of optimality of ML algorithms. The main objective of this thesis is to develop methods that reduce these errors and to analyze their effects on ML performance. In the first part of this thesis, we consider the general problem of designing wireless methods for ML applications. In particular, we present an extensive survey that divides the field into two broad categories: digital communications and analog over-the-air computation.
Digital communications refers to orthogonal communication schemes that are optimized for ML metrics, such as classification accuracy, privacy, and data importance, rather than traditional communication metrics such as fairness, data rate, and reliability. Analog over-the-air computation refers to the AirComp method and its application to distributed ML, where communication efficiency, function estimation, and privacy are key concerns. In the second part of this thesis, we focus on the analog over-the-air computation problem. We consider a network setup with multiple devices and a server that can be reached via a single hop, where the wireless channel is modeled as a multiple-access channel (MAC) with fading and additive noise. Over such a channel, the AirComp function estimate is associated with two types of error: 1) misalignment errors caused by channel fading and 2) noise-induced errors caused by the additive noise. To mitigate these errors, we propose AirComp with retransmissions and develop the optimal power control scheme for such a system. Furthermore, we use optimization theory to derive bounds on the convergence of an AirComp-supported ML system that reveal a relationship between the number of retransmissions and the loss of the ML model. Finally, numerical results show that retransmissions can significantly improve ML performance, especially in low-SNR scenarios.
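The effect of retransmissions on the noise-induced error can be illustrated with a small simulation. This sketch assumes an idealized channel in which the transmitted values add up perfectly over the air and only additive Gaussian noise remains, so the fading, misalignment errors, and power control treated in the thesis are deliberately left out. All names and numbers are illustrative.

```python
import random

def aircomp_estimate(values, noise_std, retransmissions, rng):
    """Estimate sum(values) by averaging several noisy superimposed receptions."""
    received = [
        sum(values) + rng.gauss(0.0, noise_std)  # signals add up over the air
        for _ in range(retransmissions)
    ]
    return sum(received) / retransmissions

def mean_squared_error(values, noise_std, retransmissions, trials=2000, seed=0):
    """Monte Carlo estimate of the squared error of the AirComp sum estimate."""
    rng = random.Random(seed)
    target = sum(values)
    errors = [
        (aircomp_estimate(values, noise_std, retransmissions, rng) - target) ** 2
        for _ in range(trials)
    ]
    return sum(errors) / trials

values = [0.3, -1.2, 0.8, 0.5]                   # one scalar per device
mse_1 = mean_squared_error(values, noise_std=1.0, retransmissions=1)
mse_16 = mean_squared_error(values, noise_std=1.0, retransmissions=16)
print(mse_1, mse_16)
```

Under these assumptions, averaging M independent receptions divides the noise variance by M, which is why the error with 16 retransmissions is far smaller than with one. The cost is M times the channel use, a trade-off this toy model does not capture.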
46

Resource Allocation for Federated Learning over Wireless Networks

Jansson, Fredrik January 2022
This thesis examines resource allocation for federated learning in wireless networks. In federated learning, a server and a number of users exchange neural network parameters during training. The thesis aims to create a realistic simulation of a federated learning process by building a channel model and applying compression when channel capacity is insufficient. We find that federated learning can tolerate high sparsification compression ratios. We also investigate how the choice of users and scheduling schemes affects the convergence speed and accuracy of the training process. We conclude that the best choice of scheduling scheme depends on how the data is distributed across users.
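One way to picture "compression when channel capacity is insufficient" is to derive a sparsification ratio from a per-round bit budget. The sketch below assumes the Shannon capacity formula as the channel model and ignores the overhead of encoding which entries were kept; the abstract does not state which model the thesis uses, and all parameter names and values are hypothetical.

```python
import math

def shannon_capacity_bps(bandwidth_hz, snr_linear):
    """Shannon capacity of an AWGN channel in bits per second."""
    return bandwidth_hz * math.log2(1.0 + snr_linear)

def keep_fraction(n_params, bits_per_param, bandwidth_hz, snr_linear, slot_seconds):
    """Fraction of model parameters that fits into one transmission slot."""
    budget_bits = shannon_capacity_bps(bandwidth_hz, snr_linear) * slot_seconds
    needed_bits = n_params * bits_per_param
    return min(1.0, budget_bits / needed_bits)   # 1.0 means no compression needed

# A 1M-parameter model in 32-bit floats over a 1 MHz channel at SNR 15 (about 11.8 dB).
frac = keep_fraction(n_params=1_000_000, bits_per_param=32,
                     bandwidth_hz=1e6, snr_linear=15.0, slot_seconds=1.0)
print(frac)  # 0.125
```

Here the channel carries 4 Mbit in the slot while the full update needs 32 Mbit, so only one parameter in eight can be sent. A user with a better channel, or a smaller model, would get a fraction of 1.0 and send the update uncompressed.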
47

Autonomic Management and Orchestration Strategies in MEC-Enabled 5G Networks

Subramanya, Tejas 26 October 2021
5G and beyond mobile network technology promises to deliver unprecedented ultra-low latency and high data rates, paving the way for many novel applications and services. Network Function Virtualization (NFV) and Multi-access Edge Computing (MEC) are two technologies expected to play a vital role in achieving the ambitious Quality of Service requirements of such applications. While NFV provides flexibility by enabling network functions to be dynamically deployed and interconnected to realize Service Function Chains (SFCs), MEC brings computing capability to the mobile network's edges, thus reducing latency and alleviating the transport network load. However, adequate mechanisms are needed to meet the dynamically changing network service demands (i.e., in single and multiple domains) and optimally utilize the network resources while ensuring that the end-to-end latency requirement of services is always satisfied. In this dissertation, we break the problem into three separate stages and present solutions for each of them. Firstly, we apply Artificial Intelligence (AI) techniques to drive NFV resource orchestration in MEC-enabled 5G architectures for single and multi-domain scenarios. We propose three deep learning approaches to perform horizontal and vertical Virtual Network Function (VNF) auto-scaling: (i) Multilayer Perceptron (MLP) classification and regression (single-domain), (ii) centralized Artificial Neural Network (ANN), centralized Long Short-Term Memory (LSTM), and centralized Convolutional Neural Network-LSTM (CNN-LSTM) (single-domain), and (iii) federated ANN, federated LSTM, and federated CNN-LSTM (multi-domain). We evaluate the performance of each of these deep learning models trained over a commercial network operator dataset and investigate the pros and cons of the different approaches for VNF auto-scaling.
For the first approach, our results show that both the MLP classifier and the MLP regressor have strong predictive capability for auto-scaling. However, the MLP regressor outperforms the MLP classifier in terms of accuracy. For the second approach (one-step prediction), CNN-LSTM performs best for the QoS-prioritized objective and LSTM performs best for the cost-prioritized objective. For the second approach (multi-step prediction), the encoder-decoder CNN-LSTM model outperforms the encoder-decoder LSTM model for both the QoS-prioritized and cost-prioritized objectives. For the third approach, the federated LSTM and federated CNN-LSTM models perform equally well, and both outperform the federated ANN model. We also note that, in general, federated learning approaches perform worse than centralized learning approaches. Secondly, we employ Integer Linear Programming (ILP) techniques to formulate and solve a joint user association and SFC placement problem, where each SFC represents a service requested by a user with end-to-end latency and data rate requirements. We also develop a comprehensive end-to-end latency model considering radio delay, backhaul network delay, and SFC processing delay for 5G mobile networks. We evaluate the proposed model using simulations based on a real operator network topology and real-world latency values. Our results show that the average end-to-end latency is reduced significantly when SFCs are placed at the ME hosts according to their latency and data rate demands. Furthermore, we propose a heuristic algorithm to address the scalability of the ILP; it can solve the above association/mapping problem in seconds rather than hours. Finally, we introduce lightMEC, a lightweight MEC platform for deploying mobile edge computing functionalities, which allows low-latency and bandwidth-intensive applications to be hosted at the network edge.
Measurements conducted over a real-life test demonstrated that lightMEC could actually support practical MEC applications without requiring any change to existing mobile network nodes' functionality in the access and core network segments. The significant benefits of adopting the proposed architecture are analyzed based on a proof-of-concept demonstration of the content caching use case. Furthermore, we introduce the AI-driven Kubernetes orchestration prototype that we implemented by leveraging the lightMEC platform and assess the performance of the proposed deep learning models (from stage 1) in an experimental setup. The prototype evaluations confirm the simulation results achieved in stage 1 of the thesis.
48

Modern Bandit Optimization with Statistical Guarantees

Wenjie Li (17506956) 01 December 2023
Bandits and optimization are prominent areas of machine learning research. Despite extensive prior research on these topics in various contexts, modern challenges, such as dealing with highly unsmooth nonlinear reward objectives and incorporating federated learning, have sparked new discussions. The X-armed bandit problem is a specialized case where bandit algorithms and black-box optimization techniques join forces to address noisy reward functions within continuous domains to minimize the regret. This thesis concentrates on the X-armed bandit problem in a modern setting. In the first chapter, we introduce an optimal statistical collaboration framework for the single-client X-armed bandit problem, expanding the range of objectives by considering more general smoothness assumptions and emphasizing tighter statistical error measures to expedite learning. The second chapter addresses the federated X-armed bandit problem, providing a solution for collaboratively optimizing the average global objective while ensuring client privacy. In the third chapter, we confront the more intricate personalized federated X-armed bandit problem and propose an enhanced algorithm that facilitates the simultaneous optimization of all local objectives.
49

Secure and efficient federated learning

Li, Xingyu 12 May 2023
In the past 10 years, the growth of machine learning technology has been significant, largely due to the availability of large datasets for training. However, gathering a sufficient amount of data on a central server can be challenging. Additionally, with the rise of mobile networking and the large amounts of data generated by IoT devices, privacy and security have become a concern, resulting in regulations and legislative proposals such as GDPR, HIPAA, CCPA, and ADPPA. Under these circumstances, traditional centralized machine learning methods face the problem that sensitive data must be kept local for privacy reasons, making it difficult to achieve the desired learning outcomes. Federated learning (FL) offers a solution by allowing a shared global model to be trained by exchanging locally computed optima instead of the actual data. Despite its success as a natural solution for machine learning on IoT devices, FL still faces challenges with regard to security and performance. These include high communication costs between IoT devices and the central server, the potential for sensitive information leakage and reduced model precision due to the aggregation process in the distributed IoT network, and performance concerns caused by the heterogeneity of data and devices in the network. In this dissertation, I present practical and effective techniques with strong theoretical support to address these challenges. To optimize communication resources, I introduce a new multi-server FL framework called MS-FedAvg. To enhance security, I propose a robust defense algorithm called LoMar. To address data heterogeneity, I present FedLGA, and for device heterogeneity, I propose FedSAM.
50

Classifying femur fractures using federated learning

Zhang, Hong January 2024
The rarity and subtle radiographic features of atypical femoral fractures (AFF) make them difficult to distinguish radiologically from normal femoral fractures (NFF). AFF is also associated with the long-term use of bisphosphonates for the treatment of osteoporosis. Automatically classifying AFF and NFF not only helps improve the diagnosis rate of AFF but also helps patients receive timely treatment. In recent years, automatic classification technologies for AFF and NFF have continued to emerge, including convolutional neural networks (CNNs), vision transformers (ViTs), and multimodal deep learning prediction models. These methods are all based on deep learning and require centralized radiograph datasets. However, centralizing medical radiograph data raises issues of patient privacy and data heterogeneity. First, radiograph data is difficult to share among hospitals, and relevant laws or guidelines prohibit the dissemination of these data. Second, radiographs differ systematically between hospital datasets, and standard deep learning does not fully account for the problem of fusing these multi-source heterogeneous datasets. Based on federated learning, we implemented a distributed deep learning strategy that avoids centralized datasets, thereby protecting the local radiograph datasets of medical institutions and patient privacy. To achieve this goal, we studied approximately 4000 images from 72 hospitals in Sweden, containing 206 AFF patients and 744 NFF patients.
By distributing the radiograph datasets of different hospitals across 3-5 nodes, we simulate real-world data distribution scenarios, train the local models of the nodes separately, and aggregate the global model, combined with percentile privacy protection to further protect the local datasets. In addition, we compare the performance of federated learning models using different aggregation algorithms (FedAvg, FedProx, and FedOpt). The resulting federated global model is better than the locally trained models, and its performance is close to that of the centralized learning model; on some metrics it is even better. We conducted federated learning training with 3 nodes and with 5 nodes. Limited by the dataset size of each node, 5-node federated learning does not perform significantly better than 3-node federated learning. Federated learning is conducive to collaborative training of high-quality prediction models among medical institutions while also protecting sensitive medical data. We believe that it will become a paradigm for collaborative model training in the foreseeable future.
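FedProx, one of the algorithms compared above, differs from FedAvg mainly on the client side: each local step adds a proximal term that pulls the local model back toward the current global model. The following minimal sketch assumes plain gradient descent on a toy quadratic loss; the function names, `mu`, and the learning rate are illustrative and do not come from the thesis.

```python
def fedprox_local_update(w_global, grad_fn, mu, lr, steps):
    """Local training on F(w) + (mu / 2) * ||w - w_global||^2 by gradient descent."""
    w = list(w_global)
    for _ in range(steps):
        g = grad_fn(w)
        # The proximal term contributes mu * (w - w_global) to the gradient.
        w = [wi - lr * (gi + mu * (wi - w0)) for wi, gi, w0 in zip(w, g, w_global)]
    return w

def local_grad(w):
    """Gradient of a toy local loss (w - 4)^2, whose minimizer is far from w_global."""
    return [2.0 * (wi - 4.0) for wi in w]

w_global = [0.0]
w_plain = fedprox_local_update(w_global, local_grad, mu=0.0, lr=0.1, steps=50)
w_prox = fedprox_local_update(w_global, local_grad, mu=1.0, lr=0.1, steps=50)
print(w_plain, w_prox)
```

With `mu=0.0` the update reduces to ordinary local training and the client drifts all the way to its own optimum at 4. With `mu=1.0` it settles near 8/3, closer to the global model, which is how FedProx limits client drift when hospital datasets are heterogeneous.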
