Global ETD Search

1	PRACTICAL CONFIDENTIALITY-PRESERVING DATA ANALYTICS IN UNTRUSTED CLOUDS Savvas Savvides (9113975) 27 July 2020

Cloud computing offers a cost-efficient data analytics platform. This is enabled by constant innovations in tools and technologies for analyzing large volumes of data through distributed batch processing systems and real-time data through distributed stream processing systems. However, due to the sensitive nature of data, many organizations are reluctant to analyze their data in public clouds. To address this stalemate, both software-based and hardware-based solutions have been proposed, yet all have substantial limitations in terms of efficiency, expressiveness, and security. In this thesis, we present solutions that enable practical and expressive confidentiality-preserving batch and stream-based analytics. We achieve this by performing computations over encrypted data using Partially Homomorphic Encryption (PHE) and Property-Preserving Encryption (PPE) in novel ways, and by utilizing remote or Trusted Execution Environment (TEE) based trusted services where needed.

We introduce a set of extensions and optimizations to PHE and PPE schemes and propose the novel abstraction of Secure Data Types (SDTs), which enables the application of PHE and PPE schemes in ways that improve performance and security. These abstractions are leveraged to enable a set of compilation techniques that make data analytics over encrypted data more practical. When PHE alone is not expressive enough to perform analytics over encrypted data, we use a novel planner engine to decide the most efficient way of utilizing client-side completion, remote re-encryption, or trusted hardware re-encryption based on Intel Software Guard eXtensions (SGX) to overcome the limitations of PHE. We also introduce two novel symmetric PHE schemes that allow arithmetic operations over encrypted data. Being symmetric, our schemes are more efficient than the state-of-the-art asymmetric PHE schemes without compromising the level of security or the range of homomorphic operations they support. We apply the aforementioned techniques in the context of batch data analytics and demonstrate the improvements over previous systems. Finally, we present techniques designed to enable the use of PHE and PPE in resource-constrained Internet of Things (IoT) devices and demonstrate the practicality of stream processing over encrypted data.

Distributed Computing; Computer System Security; Data Encryption; Cloud computing; Distributed processing of data; Applied Cryptography; Encrypted Databases; Trusted Execution Environments; Intel SGX; Homomorphic Encryption; Confidentiality
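As a rough illustration of what "arithmetic operations over encrypted data" means in entry 1, the sketch below uses the textbook Paillier cryptosystem, which is an asymmetric additively homomorphic scheme. It is not one of the symmetric schemes the thesis introduces, whose details the abstract does not give. The function names and the tiny primes are assumptions chosen for readability, and the parameters are far too small to be secure.

```python
import math
import random


def keygen(p=1789, q=1861):
    """Build a toy Paillier key pair from two small primes (illustrative only)."""
    n = p * q
    # Carmichael's lambda for n = p*q is lcm(p-1, q-1).
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)
    g = n + 1                      # common simplifying choice of generator
    mu = pow(lam, -1, n)           # modular inverse, valid because g = n + 1
    return (n, g), (lam, mu)


def encrypt(pub, m, rng=random):
    """Encrypt integer m (0 <= m < n) under the public key."""
    n, g = pub
    n2 = n * n
    while True:
        r = rng.randrange(1, n)
        if math.gcd(r, n) == 1:    # r must be invertible modulo n
            break
    return (pow(g, m, n2) * pow(r, n, n2)) % n2


def decrypt(pub, priv, c):
    """Recover the plaintext from ciphertext c."""
    n, _ = pub
    lam, mu = priv
    u = pow(c, lam, n * n)
    return ((u - 1) // n * mu) % n


def add_encrypted(pub, c1, c2):
    """Multiplying ciphertexts adds the underlying plaintexts modulo n."""
    n, _ = pub
    return (c1 * c2) % (n * n)


if __name__ == "__main__":
    pub, priv = keygen()
    total = add_encrypted(pub, encrypt(pub, 123), encrypt(pub, 456))
    print(decrypt(pub, priv, total))
```

The party that computes `add_encrypted` sees only ciphertexts and the public key, which is the property that lets an untrusted cloud aggregate values it cannot read.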
2	Distributed Network Processing and Optimization under Communication Constraint Chang Shen Lee (11184969) 26 July 2021

In recent years, the amount of data in information processing systems has increased significantly, a phenomenon referred to as big data. The design of systems handling big data calls for a scalable approach, which brings distributed systems into the picture. In contrast to centralized systems, data are spread across the network of agents in a distributed system, and agents cooperatively complete tasks through local communications and local computations. However, the design and analysis of distributed systems, in which no central coordinator with complete information is present, are challenging tasks. Because communication among agents is what enables multi-agent coordination, among other tasks, practical communication constraints should be taken into consideration in the design and analysis of such systems. The focus of this dissertation is to provide design and analysis of distributed network processing using finite-rate communications among agents. In particular, we address the following open questions: 1) can one design algorithms balancing a graph weight matrix using finite-rate and simplex communications among agents? 2) can one design algorithms computing the average of agents’ states using finite-rate and simplex communications? and 3) going beyond ad-hoc algorithmic designs, can one design a black-box mechanism transforming a general class of algorithms with unquantized communication into their finite-bit quantized counterparts?

This dissertation addresses the above questions. First, we propose novel distributed algorithms solving the weight-balancing and average consensus problems using only finite-rate simplex communications among agents, compliant with the directed nature of the network topology. A novel convergence analysis is put forth, based on a new metric inspired by positional system representations. In the second half of this dissertation, distributed optimization subject to quantized communications is studied. Specifically, we consider a general class of linearly convergent distributed algorithms cast as fixed-point iterations, and propose a novel black-box quantization mechanism. The proposed mechanism uses a novel quantizer that preserves linear convergence and is proved to be more communication efficient than state-of-the-art quantization mechanisms. Extensive numerical results validate our theoretical findings.

Computer Engineering; Control Systems, Robotics and Automation; Signal Processing; Distributed Optimization; Quantization; decentralized computation; decentralized coordination; Distributed processing of data; distributed computing; distributed consensus; Fixed point theory
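For readers unfamiliar with the average consensus problem named in entry 2, the following is a minimal sketch of the classical unquantized baseline on an undirected ring, where each agent repeatedly moves toward its neighbors' values. It does not implement the finite-rate, simplex, directed-graph algorithms the dissertation proposes. The helper names, the ring topology, and the step size `w` are assumptions made for illustration.

```python
def ring(n):
    """Neighbor lists for an undirected ring of n agents (assumed topology)."""
    return [[(i - 1) % n, (i + 1) % n] for i in range(n)]


def consensus_step(x, neighbors, w):
    """One synchronous update: each agent uses only its neighbors' states."""
    return [
        x[i] + w * sum(x[j] - x[i] for j in neighbors[i])
        for i in range(len(x))
    ]


def run_consensus(x, neighbors, w=0.3, iters=200):
    """Iterate the local update; w must stay below 1 / (max degree)."""
    for _ in range(iters):
        x = consensus_step(x, neighbors, w)
    return x


if __name__ == "__main__":
    x0 = [1.0, 5.0, 3.0, 9.0, 2.0]
    print(run_consensus(x0, ring(5)))
```

Because the ring is undirected and every edge uses the same weight, the update preserves the sum of the states, so the agents converge to the average of the initial values. Exchanging real-valued states exactly, as this sketch does, is the assumption the dissertation removes.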
3	Scalable Parallel Machine Learning on High Performance Computing Systems–Clustering and Reinforcement Learning Weijian Zheng (14226626) 08 December 2022

High-performance computing (HPC) and machine learning (ML) have been widely adopted by both academia and industry to address enormous data problems at extreme scales. While research has reported on the interactions of HPC and ML, achieving high performance and scalability for parallel and distributed ML algorithms is still a challenging task. This dissertation first summarizes the major challenges for applying HPC to ML applications: 1) poor performance and scalability, 2) loss of the convergence rate, 3) lower quality of the trained model, and 4) a lack of performance optimization techniques designed for specific applications. Researchers can address the four challenges in new ML applications. This dissertation shows how to solve them for two specific applications: 1) a clustering algorithm and 2) graph optimization algorithms that use reinforcement learning (RL).

As to the clustering algorithm, we first propose an algorithm called the simulated-annealing enhanced clustering algorithm. By combining a blocked data layout and asynchronous local optimization within each thread, the simulated-annealing enhanced clustering algorithm has a convergence rate that is comparable to the K-means algorithm but with much higher performance. Experiments with synthetic and real-world datasets show that the simulated-annealing enhanced clustering algorithm is significantly faster than the MPI K-means library using up to 1024 cores. However, the optimization costs (Sum of Squared Errors (SSE)) of the simulated-annealing enhanced clustering algorithm became higher than the original costs. To tackle this problem, we devise a new algorithm called the full-step feel-the-way clustering algorithm. In the full-step feel-the-way algorithm, there are L local steps within each block of data points. We use the first local step’s results to compute accurate global optimization costs. Our results show that the full-step algorithm can significantly reduce the global number of iterations needed to converge while obtaining low SSE costs. However, the time spent on the local steps is greater than the benefits of the saved iterations. To improve this performance, we next optimize the local step time by incorporating a sampling-based method called reassignment-history-aware sampling. Extensive experiments with various synthetic and real-world datasets (e.g., MNIST, CIFAR-10, ENRON, and PLACES-2) show that our parallel algorithms can outperform the fastest open-source MPI K-means implementation by up to 110% on 4,096 CPU cores with comparable SSE costs.

Our evaluations of the sampling-based feel-the-way algorithm establish the effectiveness of the local optimization strategy, the blocked data layout, and the sampling methods for addressing the challenges of applying HPC to ML applications. To explore more parallel strategies and optimization techniques, we focus on a more complex application: graph optimization problems using reinforcement learning (RL). RL has proved successful for automatically learning good heuristics to solve graph optimization problems. However, the existing RL systems either do not support graph RL environments or do not support multiple or many GPUs in a distributed setting. This has compromised RL’s ability to solve large-scale graph optimization problems due to the lack of parallelization and high scalability. To address the challenges of parallelization and scalability, we develop OpenGraphGym-MG, a high-performance distributed-GPU RL framework for solving graph optimization problems. OpenGraphGym-MG focuses on a class of computationally demanding RL problems in which both the RL environment and the policy model are highly computation intensive. In this work, we distribute large-scale graphs across distributed GPUs and use spatial parallelism and data parallelism to achieve scalable performance. We compare and analyze the performance of spatial and data parallelism and highlight their differences. To support graph neural network (GNN) layers that take data samples partitioned across distributed GPUs as input, we design new parallel mathematical kernels to perform operations on distributed 3D sparse and 3D dense tensors. To handle costly RL environments, we design new parallel graph environments to scale up all RL-environment-related operations. By combining the scalable GNN layers with the scalable RL environment, we are able to develop high-performance OpenGraphGym-MG training and inference algorithms in parallel.

To summarize, after identifying the major challenges for applying HPC to ML applications, this thesis explores several parallel strategies and performance optimization techniques using two ML applications. Specifically, we propose a local optimization strategy, a blocked data layout, and sampling methods for accelerating the clustering algorithm, and we create a spatial parallelism strategy; a parallel graph environment, agent, and policy model; an optimized replay buffer; and a multi-node selection strategy for solving large optimization problems over graphs. Our evaluations prove the effectiveness of these strategies and demonstrate that our accelerations can significantly outperform the state-of-the-art ML libraries and frameworks without loss of quality in trained models.

Graph, social and multimedia data; Distributed systems and algorithms; High performance computing; Reinforcement learning; High Performance Computing (HPC); Clustering Algorithm; Reinforcement Learning; combinatorial optimization problems; graph problems; Travelling salesperson problem; Minimum Vertex Cover Problem; Distributed processing of data; NP-Hard optimization problems; model parallelism; data parallelism
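The SSE cost that entry 3 uses to compare clustering quality can be made concrete with a small sketch of one standard K-means (Lloyd) iteration. This is the plain serial baseline, not the simulated-annealing or feel-the-way algorithms the dissertation proposes. The function names and the four-point dataset are assumptions chosen for illustration.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign(points, centroids):
    """Label each point with the index of its nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda k: sq_dist(p, centroids[k]))
        for p in points
    ]


def sse(points, centroids, labels):
    """Sum of squared errors: the total squared distance to assigned centroids."""
    return sum(sq_dist(p, centroids[l]) for p, l in zip(points, labels))


def update(points, labels, centroids):
    """Move each centroid to the mean of its points; keep empty clusters fixed."""
    new = []
    for k, old in enumerate(centroids):
        members = [p for p, l in zip(points, labels) if l == k]
        if not members:
            new.append(old)
            continue
        dim = len(old)
        new.append(tuple(sum(m[d] for m in members) / len(members) for d in range(dim)))
    return new


if __name__ == "__main__":
    points = [(0, 0), (0, 2), (10, 0), (10, 2)]
    centroids = [(0, 0), (10, 0)]
    labels = assign(points, centroids)
    print(sse(points, centroids, labels))          # cost before the update
    centroids = update(points, labels, centroids)
    print(sse(points, centroids, labels))          # cost after the update
```

The assignment and SSE sums are independent per point, which is what makes a blocked, multi-threaded layout of the kind the abstract describes a natural fit.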
