1

Integrality Gaps for Strong Linear Programming and Semidefinite Programming Relaxations

Georgiou, Konstantinos 17 February 2011 (has links)
The inapproximability of NP-hard combinatorial optimization problems lies at the heart of theoretical computer science. A negative result can be either conditional, where the starting point is a complexity assumption, or unconditional, where the inapproximability holds for a restricted model of computation. Algorithms based on Linear Programming (LP) and Semidefinite Programming (SDP) relaxations are among the most prominent such models of computation. The standard measure of their efficiency is the integrality gap, which sets the limits of the associated algorithmic schemes. A number of systematic procedures, known as lift-and-project systems, have been proposed to improve the integrality gap of standard relaxations. These systems build strong hierarchies of either LP relaxations, such as the Lovász-Schrijver (LS) and the Sherali-Adams (SA) systems, or SDP relaxations, such as the Lovász-Schrijver SDP (LS+), the Sherali-Adams SDP (SA+), and the Lasserre (La) systems. In this thesis we prove integrality gap lower bounds for the aforementioned lift-and-project systems on a number of combinatorial optimization problems whose inapproximability remains unresolved. Since lift-and-project systems produce the relaxations behind the best algorithms known for a series of combinatorial problems, these lower bounds can be read as strong evidence of the inapproximability of the corresponding optimization problems. Among the results in the thesis, we highlight the following:
- For every ε > 0, the level-Ω(√(log n / log log n)) LS+ relaxation of the Vertex Cover polytope has integrality gap 2 − ε.
- The integrality gap of the standard SDP for Vertex Cover remains 2 − o(1) even if all hypermetric inequalities are added to the relaxation; the resulting relaxations are incomparable to the SDP relaxations derived by the LS+ system. The addition of all ℓ1 inequalities, however, eliminates all solutions outside the integral hull.
- For every ε > 0, the level-Ω(√(log n / log log n)) SA relaxation of Vertex Cover has integrality gap 2 − ε, and the gap remains tight even for superconstant-level SA+ relaxations.
- We prove a tight lower bound on the number of tightenings that the SA system needs in order to prove the Pigeonhole Principle, and we prove sublinear and linear rank bounds for the La and SA systems, respectively, on the Tseitin tautology.
- Linear levels of the SA+ system treat highly unsatisfiable instances of fixed predicate-P constraint satisfaction problems over q-ary alphabets as fully satisfiable, whenever the satisfying assignments of the predicate P can be equipped with a balanced and pairwise independent distribution.
- We study the performance of the Lasserre system on the cut polytope: when the input is the complete graph on 2d+1 vertices, the integrality gap of the level-d SDP relaxation is at least 1 + 1/(4d(d+1)).
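As background for the integrality-gap results above, the following sketch (a standard textbook illustration, not taken from the thesis) computes the basic LP relaxation of Vertex Cover on the complete graph K_n and compares it with the integral optimum. On K_n the LP attains value n/2 (set every x_v = 1/2), while any integral cover needs n − 1 vertices, so the gap 2 − 2/n approaches 2; the hierarchies discussed in the thesis strengthen exactly this kind of relaxation.

```python
# Standard illustration (not the thesis's SDP): the basic LP relaxation of
# Vertex Cover on K_n has value n/2, while the integral optimum is n - 1,
# giving integrality gap (n - 1) / (n / 2) = 2 - 2/n -> 2.
from itertools import combinations
from scipy.optimize import linprog

def vertex_cover_lp(n):
    edges = list(combinations(range(n), 2))  # edge set of K_n
    # minimize sum_v x_v  s.t.  x_u + x_v >= 1 for every edge, 0 <= x_v <= 1
    c = [1.0] * n
    A_ub, b_ub = [], []
    for u, v in edges:
        row = [0.0] * n
        row[u] = row[v] = -1.0               # -(x_u + x_v) <= -1
        A_ub.append(row)
        b_ub.append(-1.0)
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, 1)] * n, method="highs")
    return res.fun

for n in (10, 50, 200):
    lp = vertex_cover_lp(n)                  # equals n / 2
    opt = n - 1                              # any n-1 vertices cover K_n; n-2 do not
    print(f"n={n}: LP={lp:.1f}, OPT={opt}, gap={opt / lp:.3f}")
```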
2

Scalable Parallel Machine Learning on High Performance Computing Systems – Clustering and Reinforcement Learning

Weijian Zheng (14226626) 08 December 2022 (has links)
High-performance computing (HPC) and machine learning (ML) have been widely adopted by both academia and industry to address enormous data problems at extreme scales. While research has reported on the interactions of HPC and ML, achieving high performance and scalability for parallel and distributed ML algorithms remains a challenging task. This dissertation first summarizes the major challenges of applying HPC to ML applications: 1) poor performance and scalability, 2) loss of the convergence rate, 3) lower quality of the trained model, and 4) a lack of performance optimization techniques designed for specific applications. These challenges recur across new ML applications; this dissertation shows how to solve them for two specific ones: 1) a clustering algorithm and 2) graph optimization algorithms that use reinforcement learning (RL).

For the clustering algorithm, we first propose the simulated-annealing enhanced clustering algorithm. By combining a blocked data layout with asynchronous local optimization within each thread, it achieves a convergence rate comparable to the K-means algorithm but with much higher performance. Experiments with synthetic and real-world datasets show that it is significantly faster than the MPI K-means library on up to 1,024 cores. However, its optimization costs (sum of squared errors, SSE) are higher than the original algorithm's. To tackle this problem, we devise the full-step feel-the-way clustering algorithm, which performs L local steps within each block of data points and uses the first local step's results to compute accurate global optimization costs. Our results show that the full-step algorithm significantly reduces the number of global iterations needed to converge while obtaining low SSE costs. However, the time spent on the local steps outweighs the benefit of the saved iterations. To improve this, we optimize the local-step time with a sampling-based method called reassignment-history-aware sampling. Extensive experiments with various synthetic and real-world datasets (e.g., MNIST, CIFAR-10, ENRON, and PLACES-2) show that our parallel algorithms can outperform the fastest open-source MPI K-means implementation by up to 110% on 4,096 CPU cores with comparable SSE costs. (A simplified sketch of the blocked local-update idea follows.)
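The dissertation's own code is not reproduced here; the following is a minimal sketch, under our own simplifying assumptions, of the blocked-layout idea: points are visited in contiguous blocks, and centroids are updated locally after each block rather than once per full pass, so later blocks immediately "feel" earlier reassignments while SSE is accumulated globally per pass. The function name and the pseudo-count initialization are our inventions for illustration.

```python
# Minimal sketch (our own simplification, not the dissertation's algorithm):
# process points in blocks; after each block, fold its points into the
# centroids via a running mean, so subsequent blocks see updated centroids.
import numpy as np

def blocked_kmeans_pass(X, centroids, block_size=256):
    sse = 0.0
    counts = np.ones(len(centroids))          # pseudo-counts for running means
    for start in range(0, len(X), block_size):
        block = X[start:start + block_size]
        d = ((block[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
        labels = d.argmin(1)
        sse += d[np.arange(len(block)), labels].sum()
        for k in np.unique(labels):           # local (per-block) centroid update
            pts = block[labels == k]
            counts[k] += len(pts)
            centroids[k] += (pts - centroids[k]).sum(0) / counts[k]
    return centroids, sse

rng = np.random.default_rng(0)
X = rng.normal(size=(4096, 2))
centroids = X[rng.choice(len(X), 8, replace=False)].copy()
for it in range(5):
    centroids, sse = blocked_kmeans_pass(X, centroids)
    print(f"pass {it}: SSE = {sse:.1f}")
```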
Our evaluations of the sampling-based feel-the-way algorithm establish the effectiveness of the local optimization strategy, the blocked data layout, and the sampling methods for addressing the challenges of applying HPC to ML applications. To explore further parallel strategies and optimization techniques, we turn to a more complex application: graph optimization problems solved with reinforcement learning (RL). RL has proved successful at automatically learning good heuristics for graph optimization problems. However, existing RL systems either do not support graph RL environments or do not support multiple GPUs in a distributed setting, which has limited RL's ability to solve large-scale graph optimization problems. To address these challenges of parallelization and scalability, we develop OpenGraphGym-MG, a high-performance distributed-GPU RL framework for solving graph optimization problems.

OpenGraphGym-MG targets a class of computationally demanding RL problems in which both the RL environment and the policy model are highly computation-intensive. We distribute large-scale graphs across GPUs and use spatial parallelism and data parallelism to achieve scalable performance, and we compare and analyze the two approaches and highlight their differences. To support graph neural network (GNN) layers whose inputs are data samples partitioned across distributed GPUs, we design new parallel mathematical kernels that operate on distributed 3D sparse and 3D dense tensors. To handle costly RL environments, we design new parallel graph environments that scale up all RL-environment-related operations. By combining the scalable GNN layers with the scalable RL environment, we develop high-performance OpenGraphGym-MG training and inference algorithms. (A toy illustration of the spatial partitioning appears after this entry.)

To summarize, after laying out the major challenges of applying HPC to ML applications, this thesis explores several parallel strategies and performance optimization techniques through two ML applications. Specifically, we propose a local optimization strategy, a blocked data layout, and sampling methods for accelerating the clustering algorithm; and we create a spatial parallelism strategy; a parallel graph environment, agent, and policy model; an optimized replay buffer; and a multi-node selection strategy for solving large optimization problems over graphs. Our evaluations prove the effectiveness of these strategies and demonstrate that our accelerated implementations significantly outperform state-of-the-art ML libraries and frameworks without loss of quality in the trained models.
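To make the spatial-parallelism idea concrete, here is a toy, single-process illustration (our own construction, not OpenGraphGym-MG's API): the adjacency matrix of a graph is split into per-device column blocks, each "device" computes a partial product against its local vertex features, and a final reduction reproduces the full GNN message-passing step A @ H.

```python
# Toy sketch (our own illustration, not OpenGraphGym-MG code) of spatial
# parallelism for one GNN message-passing step A @ H: split A into column
# blocks, one per device, each holding the features of its vertex partition;
# partial products are then summed, as an all-reduce would across GPUs.
import numpy as np

n, feat, n_devices = 12, 4, 3
rng = np.random.default_rng(1)
A = (rng.random((n, n)) < 0.3).astype(float)    # toy adjacency matrix
H = rng.normal(size=(n, feat))                  # vertex feature matrix

# Spatial partition: device d owns vertices parts[d] and stores only the
# column block A[:, parts[d]] plus the local feature rows H[parts[d]].
parts = np.array_split(np.arange(n), n_devices)
partials = [A[:, p] @ H[p] for p in parts]      # independent per-device work
out = sum(partials)                             # reduction across devices

assert np.allclose(out, A @ H)                  # matches single-device result
print("distributed A @ H matches:", np.allclose(out, A @ H))
```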
