1.
Efficient Building Blocks for Secure Multiparty Computation and Their Applications
Donghang Lu, 27 July 2022
Secure multi-party computation (MPC) enables mutually distrusting parties to compute securely over their private data. It is a natural approach for building distributed applications with strong privacy guarantees, and it has seen growing adoption in real-world privacy-preserving solutions such as privacy-preserving machine learning, secure financial analysis, and secure auctions.
The typical approach is to represent the function to be computed as an arithmetic or binary circuit, after which MPC protocols can evaluate each gate privately. The practicality of MPC has been extensively analyzed and improved over the past decade; however, traditional approaches are reaching their efficiency limits as circuits grow more complicated. We therefore follow the design principle of identifying useful high-level algebraic abstractions and constructing fast, provably secure MPC protocols to evaluate them, improving the efficiency of all applications that rely on them.
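As background for the circuit-based setting, here is a minimal sketch of additive secret sharing over a prime field, the substrate that arithmetic-circuit MPC typically builds on. The field size, party count, and helper names are illustrative assumptions, not the dissertation's construction; the point is that linear gates are local while multiplication gates require interaction, which is why higher-level algebraic building blocks pay off.

```python
# A minimal sketch of additive secret sharing over a prime field (illustrative,
# not the dissertation's protocol). Each party holds one share; no single
# share reveals anything about the secret.
import random

P = 2**61 - 1  # a Mersenne prime; real protocols choose the field to fit their needs

def share(secret, n_parties=3):
    """Split `secret` into n_parties additive shares that sum to it mod P."""
    shares = [random.randrange(P) for _ in range(n_parties - 1)]
    shares.append((secret - sum(shares)) % P)
    return shares

def reconstruct(shares):
    return sum(shares) % P

# Addition of shared values is purely local: each party adds its own shares.
x_shares, y_shares = share(42), share(100)
z_shares = [(a + b) % P for a, b in zip(x_shares, y_shares)]
assert reconstruct(z_shares) == 142
```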
To begin with, we construct an MPC protocol that efficiently evaluates the powers of a secret value. We then use it as a building block for a secure mixing protocol, which can be applied directly to anonymous broadcast communication. We propose two protocols for secure mixing that offer different tradeoffs between local computation and communication. We also study the need for robustness and fairness in many use cases and show how to provide these properties in general MPC protocols. As follow-up work in this direction, we design more efficient MPC protocols for anonymous communication based on permutation matrices, with three variants targeting different MPC frameworks and input volumes. Because the core of these protocols is a secure random permutation, they are of independent interest for further applications such as secure sorting and secure two-way communication.
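The permutation-matrix idea behind these protocols can be shown in the clear with a short sketch; in the actual MPC protocols both the permutation matrix and the inputs stay secret-shared, and the names below are hypothetical.

```python
# Plaintext illustration of mixing via a random permutation matrix: applying
# a secret permutation matrix to the vector of client messages shuffles them.
# In MPC, the matrix and the messages would both be secret-shared.
import numpy as np

rng = np.random.default_rng()

def random_permutation_matrix(n):
    perm = rng.permutation(n)
    P = np.zeros((n, n), dtype=int)
    P[np.arange(n), perm] = 1     # row i picks out input perm[i]
    return P

messages = np.array([10, 20, 30, 40])          # one message per client
P = random_permutation_matrix(len(messages))
mixed = P @ messages                           # shuffled messages
assert sorted(mixed.tolist()) == sorted(messages.tolist())
```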
We also propose and analyze a solution for another useful arithmetic operation: secure multivariate high-degree polynomial evaluation over both scalars and matrices. Secure polynomial evaluation is a basic operation in many applications, including (but not limited to) privacy-preserving machine learning, secure Markov process evaluation, and non-linear function approximation. In this work, we illustrate how our protocol can efficiently evaluate decision tree models while keeping both the client input and the tree model private. Our prototype benchmarks show that polynomial evaluation becomes significantly faster, leaving secure comparison as the only bottleneck. As follow-up work, we therefore design novel protocols for efficient secure comparison with the help of pre-computed function tables. We implement and test this idea in Falcon, a state-of-the-art privacy-preserving machine learning framework, and the benchmark results show significant performance improvements from simply replacing its secure comparison protocol with ours.
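To see why decision-tree inference reduces to polynomial evaluation, consider a depth-2 tree in the clear; the thresholds and leaf values below are made up, and in the secure setting each comparison bit would come from the secure comparison protocol while the products would be handled by the polynomial-evaluation protocol.

```python
# Illustrative only: a depth-2 decision tree written as a polynomial in its
# comparison bits. Each leaf value is multiplied by a monomial that equals 1
# exactly when the path to that leaf is taken.

def tree_as_polynomial(x, t1=5, t2=3, leaves=(10, 20, 30, 40)):
    b1 = int(x[0] < t1)   # root comparison bit
    b2 = int(x[1] < t2)   # second-level comparison bit (shared here for brevity)
    return (leaves[0] * b1 * b2
            + leaves[1] * b1 * (1 - b2)
            + leaves[2] * (1 - b1) * b2
            + leaves[3] * (1 - b1) * (1 - b2))

assert tree_as_polynomial((4, 2)) == 10   # both comparisons true
assert tree_as_polynomial((6, 9)) == 40   # both comparisons false
```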
2.
PRIVACY PRESERVING AND EFFICIENT MACHINE LEARNING ALGORITHMS
Efstathia Soufleri, 21 July 2024
<p dir="ltr">Extensive data availability has catalyzed the expansion of deep learning. Such advancements include image classification, speech, and natural language processing. However, this data-driven progress is often hindered by privacy restrictions preventing the public release of specific datasets. For example, some vision datasets cannot be shared due to privacy regulations, particularly those containing images depicting visually sensitive or disturbing content. At the same time, it is imperative to deploy deep learning efficiently, specifically Deep Neural Networks (DNNs), which are the core of deep learning. In this dissertation, we focus on achieving efficiency by reducing the computational cost of DNNs in multiple ways.</p><p dir="ltr">This thesis first tackles the privacy concerns arising from deep learning. It introduces a novel methodology that synthesizes and releases synthetic data, instead of private data. Specifically, we propose Differentially Private Image Synthesis (DP-ImgSyn) for generating and releasing synthetic images used for image classification tasks. These synthetic images satisfy the following three properties: (1) they have DP guarantees, (2) they preserve the utility of private images, ensuring that models trained using synthetic images result in comparable accuracy to those trained on private data, and (3) they are visually dissimilar from private images. The DP-ImgSyn framework consists of the following steps: firstly, a teacher model is trained on private images using a DP training algorithm. Subsequently, public images are used for initializing synthetic images, which are optimized in order to be aligned with the private dataset. This optimization leverages the teacher network's batch normalization layer statistics (mean, standard deviation) to inject information from the private dataset into the synthetic images. Third, the synthetic images and their soft labels obtained from the teacher model are released and can be employed for neural network training in image classification tasks.</p><p dir="ltr">As a second direction, this thesis delves into achieving efficiency in deep learning. With neural networks widely deployed for tackling diverse and complex problems, the resulting models often become parameter-heavy, demanding substantial computational resources for deployment. To address this challenge, we focus on quantizing the weights and the activations of DNNs. In more detail, we propose a method for compressing neural networks through layer-wise mixed-precision quantization. Determining the optimal bit widths for each layer is a non-trivial task, given the fact that the search space is exponential. Thus, we employ a Multi-Layer Perceptron (MLP) trained to determine the suitable bit-width for each layer. The Kullback-Leibler (KL) divergence of softmax outputs between the quantized and full precision networks is the metric used to gauge quantization quality. We experimentally investigate the relationship between KL divergence and network size, noting that more aggressive quantization correlates with higher divergence and vice versa. The MLP is trained using the layer-wise bit widths as labels and their corresponding KL divergence as inputs. To generate the training set, pairs of layer-wise bit widths and their respective KL divergence values are obtained through Monte Carlo sampling of the search space. 
Additionally, we aim to enhance efficiency in machine learning by introducing a computationally efficient method for action recognition on compressed videos. Rather than decompressing videos first, our approach performs action recognition directly on the compressed representation by leveraging the modalities within the compressed video format: motion vectors, residuals, and intra-frames. We deploy one neural network per modality and observe a hierarchy in convergence behavior: the network processing intra-frames tends to converge to a flatter minimum than the network processing residuals, which in turn converges to a flatter minimum than the motion-vector network. This hierarchy motivates our strategy of transferring knowledge across modalities toward flatter minima, which are generally associated with better generalization. Based on this insight, we propose Progressive Knowledge Distillation (PKD), a technique that incrementally transfers knowledge across modalities by attaching early exits, known as Internal Classifiers (ICs), to the three networks. PKD distills knowledge first from the motion-vector network, then the residual network, and finally the intra-frame network, sequentially improving the accuracy of the ICs. Moreover, we introduce Weighted Inference with Scaled Ensemble (WISE), which combines the outputs of the ICs using learned weights, boosting accuracy during inference. Together, PKD and WISE yield significant improvements in efficiency and accuracy for action recognition on compressed videos.

In summary, this dissertation contributes to advancing privacy-preserving and efficient machine learning algorithms. The proposed methodologies offer practical solutions for deploying machine learning systems in real-world scenarios by addressing data privacy and computational efficiency. Through innovative approaches to image synthesis, neural network compression, and action recognition, this work aims to foster the development of robust and scalable machine learning frameworks for diverse computer vision applications.
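As one possible reading of the WISE mechanics, the sketch below combines internal-classifier logits with learned weights at inference time; the softmax normalization and the tensor shapes are assumptions, not the thesis's exact formulation.

```python
# Hedged sketch of WISE-style inference: a weighted ensemble of the internal
# classifiers' outputs, with weights that would be learned during training.
import torch

def wise_predict(ic_logits, weights):
    """ic_logits: list of (batch, classes) tensors, one per internal classifier."""
    w = torch.softmax(weights, dim=0)       # keep the learned weights normalized
    stacked = torch.stack(ic_logits)        # (num_ics, batch, classes)
    combined = (w.view(-1, 1, 1) * stacked).sum(dim=0)
    return combined.argmax(dim=1)

# Example: three hypothetical ICs (motion vectors, residuals, intra-frames)
# over a batch of 8 clips and 5 action classes.
logits = [torch.randn(8, 5) for _ in range(3)]
weights = torch.nn.Parameter(torch.zeros(3))   # learned in practice
predictions = wise_predict(logits, weights)
```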
3.
Towards causal federated learning: a federated approach to learning representations using causal invariance
Sreya Francis
Federated Learning is an emerging privacy-preserving distributed machine learning approach that builds a shared model by training locally on participating devices (clients) and aggregating the local models into a global one. Because this approach avoids centralized data collection and aggregation, it greatly reduces the associated privacy risks. However, the data samples across participating clients are usually not independent and identically distributed (non-i.i.d.), and the out-of-distribution (OOD) generalization of the learned models can be poor. Beyond this challenge, federated learning also remains vulnerable to security attacks in which a few malicious participants insert backdoors, degrade the aggregated model, or infer the data owned by other participants. In this work, we propose an approach for learning invariant (causal) features common to all participating clients in a federated learning setup, and we analyze empirically how it improves both the OOD accuracy and the privacy of the final learned model. Although federated learning allows participants to contribute their local data without revealing it, it faces issues in data security and in accurately paying participants for quality data contributions. In this report, we also propose an EOS Blockchain design and workflow to establish data security, a novel validation-error-based metric by which we qualify gradient uploads for payment, and a small example implementation of our Blockchain Causal Federated Learning model, which we analyze with respect to robustness, privacy, and fairness in incentivization.
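To make the setup concrete, here is a minimal sketch of one federated round with local training and parameter averaging, using an IRM-style gradient penalty (Arjovsky et al.) as one common way to encourage invariant features; the penalty choice, equal-weight averaging, and hyperparameters are assumptions rather than the thesis's exact objective.

```python
# Sketch under stated assumptions: FedAvg-style aggregation where each client
# adds an IRM-style invariance penalty to its local loss.
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F

def irm_penalty(logits, labels):
    """Squared gradient of the loss w.r.t. a dummy classifier scale (IRMv1-style)."""
    scale = torch.tensor(1.0, requires_grad=True)
    loss = F.cross_entropy(logits * scale, labels)
    grad = torch.autograd.grad(loss, [scale], create_graph=True)[0]
    return (grad ** 2).sum()

def federated_round(global_model, client_loaders, lr=0.01, lam=1.0):
    client_states = []
    for loader in client_loaders:           # local training on each client
        model = copy.deepcopy(global_model)
        opt = torch.optim.SGD(model.parameters(), lr=lr)
        for x, y in loader:
            logits = model(x)
            loss = F.cross_entropy(logits, y) + lam * irm_penalty(logits, y)
            opt.zero_grad()
            loss.backward()
            opt.step()
        client_states.append(model.state_dict())
    # FedAvg: average parameters across clients (equal weights for brevity).
    avg = {k: torch.stack([s[k] for s in client_states]).mean(dim=0)
           for k in client_states[0]}
    global_model.load_state_dict(avg)
    return global_model

# Tiny usage example: two clients with synthetic, differently distributed data.
model = nn.Sequential(nn.Linear(10, 16), nn.ReLU(), nn.Linear(16, 2))
loaders = [[(torch.randn(32, 10) + i, torch.randint(0, 2, (32,)))] for i in range(2)]
model = federated_round(model, loaders)
```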
4.
An Image-based ML Approach for Wi-Fi Intrusion Detection System and Education Modules for Security and Privacy in ML
Rayed Suhail Ahmad, 02 May 2024
<p dir="ltr">The research work presented in this thesis focuses on two highly important topics in the modern age. The first topic of research is the development of various image-based Network Intrusion Detection Systems (NIDSs) and performing a comprehensive analysis of their performance. Wi-Fi networks have become ubiquitous in enterprise and home networks which creates opportunities for attackers to target the networks. These attackers exploit various vulnerabilities in Wi-Fi networks to gain unauthorized access to a network or extract data from end users' devices. The deployment of an NIDS helps detect these attacks before they can cause any significant damages to the network's functionalities or security. Within the scope of our research, we provide a comparative analysis of various deep learning (DL)-based NIDSs that utilize various imaging techniques to detect anomalous traffic in a Wi-Fi network. The second topic in this thesis is the development of learning modules for security and privacy in Machine Learning (ML). The increasing integration of ML in various domains raises concerns about its security and privacy. In order to effectively address such concerns, students learning about the basics of ML need to be made aware of the steps that are taken to develop robust and secure ML-based systems. As part of this, we introduce a set of hands-on learning modules designed to educate students on the importance of security and privacy in ML. The modules provide a theoretical learning experience through presentations and practical experience using Python Notebooks. The modules are developed in a manner that allows students to easily absorb the concepts regarding privacy and security of ML models and implement it in real-life scenarios. The efficacy of this process will be obtained from the results of the surveys conducted before and after providing the learning modules. Positive results from the survey will demonstrate the learning modules were effective in imparting knowledge to the students and the need to incorporate security and privacy concepts in introductory ML courses.</p>