Global ETD Search

11	Random projection for high-dimensional optimization / Projection aléatoire pour l'optimisation de grande dimension Vu, Khac Ky 05 July 2016 (has links) In the digital age, data has become cheap and easy to obtain. This results in many new optimization problems of extremely large size. In particular, for the same kind of problem, the numbers of variables and constraints are huge. Moreover, in many application settings such as machine learning, an accurate solution is less desirable than approximate but robust ones. It is a real challenge for traditional algorithms, which used to work well on medium-size problems, to deal with these new circumstances.
Instead of developing algorithms that scale well enough to solve these problems directly, one natural idea is to transform them into small-size problems that are strongly related to the originals. Since the new problems are of manageable size, they can still be solved efficiently by classical methods. The solutions of these new problems nevertheless give us insight into the original problems. In this thesis, we exploit this idea to solve some high-dimensional optimization problems. In particular, we apply a technique called random projection to embed the problem data into low-dimensional spaces, and approximately reformulate the problem so that it becomes very easy to solve but still captures the most important information. Therefore, by solving the projected problem, we obtain either an approximate solution or an approximate objective value for the original problem.
We apply random projection to a number of important optimization problems, including linear and integer programming (Chapter 3), convex optimization with linear constraints (Chapter 4), membership and approximate nearest neighbor (Chapter 5) and trust-region subproblems (Chapter 6).
In Chapter 3, we study optimization problems in their feasibility forms. In particular, we study the so-called restricted linear membership problem. This class contains many important problems such as linear and integer feasibility. We propose to apply a random projection T to the linear constraints, and we seek conditions on T under which the original and projected feasibility problems are equivalent with high probability.
In Chapter 4, we continue to study the above problem in the case where the restricted set is convex. Under that assumption, we can define a tangent cone at some point with minimal squared error. We establish the relations between the original and projected problems based on the concept of Gaussian width, which is popular in compressed sensing. In particular, we prove that the two problems are equivalent with high probability as long as the random projection is sampled from a sub-Gaussian ensemble with a large enough projected dimension (which depends on the Gaussian width).
In Chapter 5, we study the Euclidean membership problem: "Given a vector b and a closed Euclidean set X, decide whether b is in X or not". This is a generalization of the restricted linear membership problem considered previously. We employ a Gaussian random projection T to embed both b and X into a lower-dimensional space and study the corresponding projected version: "Decide whether Tb is in T(X) or not". When X is finite or countable, using a straightforward argument, we prove that the two problems are equivalent almost surely regardless of the projected dimension. In the case where X may be uncountable, we prove that the original and projected problems are also equivalent if the projected dimension d is proportional to some intrinsic dimension of the set X. In particular, we employ the definition of doubling dimension to estimate the relation between the two problems.
In Chapter 6, we propose to apply random projections to the trust-region subproblem. We reduce the number of variables by using a random projection and prove that optimal solutions of the new problem are in fact approximate solutions of the original. This result can be used in the trust-region framework to study black-box optimization and derivative-free optimization. Dimension reduction Approximation Optimization Randomized algorithms
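The projected membership test described for Chapter 5 can be illustrated with a small sketch for a finite set X. This is not code from the thesis: the function names, the tolerance `tol`, the dimensions, and the N(0, 1/d) scaling of the entries of T are all assumptions made for the example.

```python
import random

def gaussian_projection(d, n, rng):
    """Return a d x n matrix with i.i.d. N(0, 1/d) entries (assumed scaling)."""
    scale = 1.0 / d ** 0.5
    return [[rng.gauss(0.0, scale) for _ in range(n)] for _ in range(d)]

def project(T, x):
    """Compute the matrix-vector product T x."""
    return [sum(t * xi for t, xi in zip(row, x)) for row in T]

def projected_membership(T, b, X, tol=1e-9):
    """Decide whether Tb lies in T(X) for a finite list of points X."""
    Tb = project(T, b)
    for x in X:
        Tx = project(T, x)
        if all(abs(u - v) <= tol for u, v in zip(Tb, Tx)):
            return True
    return False

rng = random.Random(0)
n, d = 50, 3                      # original and projected dimensions (hypothetical)
X = [[rng.uniform(-1, 1) for _ in range(n)] for _ in range(20)]
T = gaussian_projection(d, n, rng)

b_in = list(X[7])                 # a point that belongs to X
b_out = [v + 1.0 for v in X[7]]   # a shifted copy that does not belong to X

print(projected_membership(T, b_in, X))
print(projected_membership(T, b_out, X))
```

For a finite X the projected test agrees with the original one almost surely, even for a small d, which is the behaviour the abstract reports for the finite and countable case.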
12	Matrix Sketching in Optimization Gregory Paul Dexter (18414855) 19 April 2024 (has links) Continuous optimization is a fundamental topic both in theoretical computer science and in applications of machine learning. Meanwhile, an important idea in the development of modern algorithms is the use of randomness to achieve empirical speedups and improved theoretical runtimes. Stochastic gradient descent (SGD) and matrix-multiplication time linear program solvers [1] are two important examples of such achievements. Matrix sketching and related ideas provide a theoretical framework for the behavior of the random matrices and vectors that arise in these algorithms, and thereby provide a natural way to better understand the behavior of such randomized algorithms. In this dissertation, we consider three general problems in this area. Data structures and algorithms randomized algorithms Random Matrix theory Continuous Optimization RandNLA
13	Combinatorial Optimization for Infinite Games on Graphs Björklund, Henrik January 2005 (has links) Games on graphs have become an indispensable tool in modern computer science. They provide powerful and expressive models for numerous phenomena and are extensively used in computer-aided verification, automata theory, logic, complexity theory, computational biology, etc. The infinite games on finite graphs we study in this thesis have their primary applications in verification, but are also of fundamental importance from the complexity-theoretic point of view. They include parity, mean payoff, and simple stochastic games. We focus on solving graph games by using iterative strategy improvement and methods from linear programming and combinatorial optimization. To this end we consider old strategy evaluation functions, construct new ones, and show how all of them, due to their structural similarities, fit into a unifying combinatorial framework. This allows us to employ randomized optimization methods from combinatorial linear programming to solve the games in expected subexponential time. We introduce and study the concept of a controlled optimization problem, capturing the essential features of many graph games, and provide sufficient conditions for solvability of such problems in expected subexponential time. The discrete strategy evaluation function for mean payoff games, which we derive from the new controlled longest-shortest path problem, leads to improvement algorithms that are considerably more efficient than the previously known ones, and also improves the efficiency of algorithms for parity games. We also define the controlled linear programming problem, and show how the games are translated into this setting. Subclasses of the problem, more general than the games considered, are shown to belong to NP ∩ coNP, or even to be solvable by subexponential algorithms. Finally, we take the first steps in investigating the fixed-parameter complexity of parity, Rabin, Streett, and Muller games. infinite games combinatorial optimization randomized algorithms model checking strategy evaluation functions linear programming iterative improvement local search Computer science
14	Combinatorial Optimization for Infinite Games on Graphs Björklund, Henrik January 2005 (has links) Games on graphs have become an indispensable tool in modern computer science. They provide powerful and expressive models for numerous phenomena and are extensively used in computer-aided verification, automata theory, logic, complexity theory, computational biology, etc.
The infinite games on finite graphs we study in this thesis have their primary applications in verification, but are also of fundamental importance from the complexity-theoretic point of view. They include parity, mean payoff, and simple stochastic games.
We focus on solving graph games by using iterative strategy improvement and methods from linear programming and combinatorial optimization. To this end we consider old strategy evaluation functions, construct new ones, and show how all of them, due to their structural similarities, fit into a unifying combinatorial framework. This allows us to employ randomized optimization methods from combinatorial linear programming to solve the games in expected subexponential time.
We introduce and study the concept of a controlled optimization problem, capturing the essential features of many graph games, and provide sufficient conditions for solvability of such problems in expected subexponential time.
The discrete strategy evaluation function for mean payoff games, which we derive from the new controlled longest-shortest path problem, leads to improvement algorithms that are considerably more efficient than the previously known ones, and also improves the efficiency of algorithms for parity games.
We also define the controlled linear programming problem, and show how the games are translated into this setting. Subclasses of the problem, more general than the games considered, are shown to belong to NP ∩ coNP, or even to be solvable by subexponential algorithms.
Finally, we take the first steps in investigating the fixed-parameter complexity of parity, Rabin, Streett, and Muller games. infinite games combinatorial optimization randomized algorithms model checking strategy evaluation functions linear programming iterative improvement local search Computer science
15	Efficient Graph Summarization of Large Networks Hajiabadi, Mahdi 24 June 2022 (has links) In this thesis, we study graph summarization, the fundamental task of finding a compact representation of the original graph, called the summary. Graph summarization can be used for reducing the footprint of the input graph, better visualization, anonymizing the identity of users, and query answering. We consider two frameworks of graph summarization in this thesis: the utility-based framework and the correction set-based framework. In the utility-based framework, the input graph is summarized as long as a utility threshold is not violated. In the correction set-based framework, a set of correction edges is produced along with the summary graph. We propose two algorithms for the utility-based framework and one for the correction set-based framework. All three algorithms are for static graphs (i.e. graphs that do not change over time). We then propose two more utility-based algorithms for fully dynamic graphs (i.e. graphs with edge insertions and deletions). Algorithms for graph summarization can be lossless (summarizing the input graph without losing any information) or lossy (losing some information about the input graph in order to summarize it more). Some of our algorithms are lossless and some are lossy, but with controlled utility loss.
Our first utility-driven graph summarization algorithm, G-SCIS, is based on a clique and independent set decomposition that produces optimal compression with zero loss of utility. The compression provided is significantly better than the state of the art in lossless graph summarization, while the runtime is two orders of magnitude lower. Our second algorithm is T-BUDS, a highly scalable, utility-driven algorithm for fully controlled lossy summarization. It achieves high scalability by combining memory reduction using a Maximum Spanning Tree with a novel binary search procedure. T-BUDS drastically outperforms the state of the art in terms of the quality of summarization and is about two orders of magnitude faster. In contrast to the competition, we are able to handle web-scale graphs on a single machine without performance impediment as the utility threshold (and size of summary) decreases. Also, we show that our graph summaries can be used as-is to answer several important classes of queries, such as triangle enumeration, PageRank and shortest paths.
We then propose LDME, a correction set-based graph summarization algorithm that produces compact output representations in a fast and scalable manner. To achieve this, we introduce (1) weighted locality sensitive hashing to drastically reduce the number of comparisons required to find good node merges, (2) an efficient way to compute the best quality merges that produces more compact outputs, and (3) a new sort-based encoding algorithm that is faster and more robust. More interestingly, our algorithm provides performance tuning settings that allow trading compression for running time. On high compression settings, LDME achieves compression equal to or better than the state of the art with up to 53x speedup in running time. On high speed settings, LDME achieves up to two orders of magnitude speedup with only slightly lower compression.
We also present two lossless summarization algorithms, Optimal and Scalable, for summarizing fully dynamic graphs. More concretely, we follow the framework of G-SCIS, which produces summaries that can be used as-is in several graph analytics tasks. Unlike G-SCIS, which is a batch algorithm, Optimal and Scalable are fully dynamic and can respond rapidly to each change in the graph. Not only do Optimal and Scalable outperform G-SCIS and other batch algorithms by several orders of magnitude, but they also significantly outperform MoSSo, the state of the art in lossless dynamic graph summarization. While Optimal always produces the optimal summary, Scalable is able to trade the amount of node reduction for extra scalability. For reasonable values of the parameter $K$, Scalable outperforms Optimal by an order of magnitude in speed, while keeping the rate of node reduction close to that of Optimal. An interesting fact that we observed experimentally is that even if we were to run a batch algorithm, such as G-SCIS, once for every big batch of changes, it would still be much slower than Scalable. For instance, if 1 million changes occur in a graph, Scalable is two orders of magnitude faster than running G-SCIS just once at the end of the 1 million-edge sequence. / Graduate Graph Summarization Query Answering Lossless summary Lossy summary Locality Sensitive Hashing Jaccard Similarity Weighted Jaccard Similarity Hashing Incremental Algorithms Randomized Algorithms
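The idea of a lossless summary built from cliques and independent sets can be illustrated with a small sketch. It is not the G-SCIS algorithm from the thesis. It assumes, for illustration only, that supernodes are formed by grouping nodes with identical open neighbourhoods (independent sets) or identical closed neighbourhoods (cliques); the function names `summarize` and `expand` are hypothetical.

```python
from collections import defaultdict

def summarize(adj):
    """Return (supernodes, superedges) for an undirected graph.

    adj maps each node to the set of its neighbours (no self-loops).
    Nodes sharing the same open neighbourhood form an independent set;
    nodes sharing the same closed neighbourhood form a clique.
    """
    by_open = defaultdict(list)
    for v in adj:
        by_open[frozenset(adj[v])].append(v)

    # Nodes without an open twin are grouped by closed neighbourhood instead.
    by_closed = defaultdict(list)
    for group in by_open.values():
        if len(group) == 1:
            v = group[0]
            by_closed[frozenset(adj[v] | {v})].append(v)

    supernodes = [sorted(g) for g in by_open.values() if len(g) > 1]
    supernodes += [sorted(g) for g in by_closed.values()]
    owner = {v: i for i, g in enumerate(supernodes) for v in g}

    # A superedge (i, i) records that supernode i is a clique.
    superedges = {tuple(sorted((owner[u], owner[v]))) for u in adj for v in adj[u]}
    return supernodes, superedges

def expand(supernodes, superedges):
    """Rebuild the original edge set from the summary."""
    edges = set()
    for i, j in superedges:
        for u in supernodes[i]:
            for v in supernodes[j]:
                if u != v:
                    edges.add(frozenset((u, v)))
    return edges

# Hub 0 joined to three leaves (1, 2, 3) and to a triangle (4, 5, 6).
adj = {
    0: {1, 2, 3, 4, 5, 6},
    1: {0}, 2: {0}, 3: {0},
    4: {0, 5, 6}, 5: {0, 4, 6}, 6: {0, 4, 5},
}
supernodes, superedges = summarize(adj)
original = {frozenset((u, v)) for u in adj for v in adj[u]}
print(supernodes)
print(expand(supernodes, superedges) == original)
```

In this toy graph the seven nodes collapse into three supernodes, and expanding the summary recovers every original edge, which is what "lossless" means in the abstract.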
16	[en] A CHARACTERIZATION OF TESTABLE GRAPH PROPERTIES IN THE DENSE GRAPH MODEL / [pt] UMA CARACTERIZAÇÃO DE PROPRIEDADES TESTÁVEIS NO MODELO DE GRAFOS DENSOS FELIPE DE OLIVEIRA 19 June 2023 (has links) [en] We consider, in this thesis, the question of determining whether a graph G has a property P, such as "G is triangle-free" or "G is 4-colorable". In particular, we consider for which properties P there exists a randomized algorithm with constant error probabilities that accepts graphs that satisfy P and rejects graphs that are epsilon-far from any graph that satisfies it. If, in addition, the algorithm has complexity independent of the size of the graph, the property is called testable. We discuss the results of Alon, Fischer, Newman, and Shapira, who obtained a combinatorial characterization of testable graph properties, solving an open problem raised in 1996. Informally, this characterization says that a graph property P is testable if and only if testing P can be reduced to testing the property of satisfying one of finitely many Szemerédi partitions. [en] RANDOMIZED ALGORITHMS [en] SZEMEREDI'S REGULARITY LEMMA [en] PROPERTY TESTING [en] GRAPH ALGORITHMS [en] APPROXIMATION ALGORITHMS
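To make the notion of a tester concrete, here is a minimal sketch of a sampling-based test for triangle-freeness with adjacency-matrix queries. It is an illustration rather than a construction taken from the dissertation: the function name and the `samples` parameter are hypothetical, and the number of samples needed for a given epsilon is not derived here.

```python
import random

def looks_triangle_free(adj, n, samples, rng):
    """Sample vertex triples; reject as soon as one of them is a triangle.

    adj is an n x n symmetric 0/1 adjacency matrix. A triangle-free graph
    is always accepted, so the error is one-sided.
    """
    for _ in range(samples):
        u, v, w = rng.sample(range(n), 3)
        if adj[u][v] and adj[v][w] and adj[u][w]:
            return False
    return True

rng = random.Random(0)
n = 8

# Complete bipartite graph on parts {0..3} and {4..7}: triangle-free.
bipartite = [[1 if (i < 4) != (j < 4) else 0 for j in range(n)] for i in range(n)]

# Complete graph: every triple of distinct vertices is a triangle.
complete = [[1 if i != j else 0 for j in range(n)] for i in range(n)]

print(looks_triangle_free(bipartite, n, 50, rng))
print(looks_triangle_free(complete, n, 50, rng))
```

The number of queries is `3 * samples` regardless of `n`, which is the sense in which the complexity of a tester is independent of the size of the graph.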
17	A parallel iterative solver for large sparse linear systems enhanced with randomization and GPU accelerator, and its resilience to soft errors / Un solveur parallèle itératif pour les grands systèmes linéaires creux, amélioré par la randomisation et l'utilisation des accélérateurs GPU, et sa résilience aux fautes logicielles Jamal, Aygul 28 September 2017 (has links) In this PhD thesis, we address three challenges faced by linear algebra solvers in the perspective of future exascale systems: accelerating convergence using innovative techniques at the algorithm level, taking advantage of GPU (Graphics Processing Unit) accelerators to enhance the performance of computations on hybrid CPU/GPU systems, and evaluating the impact of errors in the context of an increasing level of parallelism in supercomputers. We are interested in methods that accelerate the convergence and reduce the execution time of iterative solvers for large sparse linear systems. The solver specifically considered in this work is the parallel Algebraic Recursive Multilevel Solver (pARMS), which is a distributed-memory parallel solver based on Krylov subspace methods.
First, we integrate a randomization technique referred to as Random Butterfly Transformations (RBT), which has been successfully applied to remove the cost of pivoting in the solution of dense linear systems. Our objective is to apply this method in the ARMS preconditioner to solve more efficiently the last Schur complement system in the application of the recursive multilevel process in pARMS. The experimental results show an improvement in convergence and accuracy. Due to memory concerns for some test problems, we also propose to use a sparse variant of RBT followed by a sparse direct solver (SuperLU), resulting in an improvement in execution time.
Then we explain how a non-intrusive approach can be applied to implement GPU computing in the pARMS solver, in particular for the local preconditioning phase, which represents a significant part of the time needed to compute the solution. We compare the CPU-only and hybrid CPU/GPU variants of the solver on several test problems coming from physical applications. The performance results of the hybrid CPU/GPU solver using the ARMS preconditioning combined with RBT, or the ILU(0) preconditioning, show a performance gain of up to 30% on the test problems considered in our experiments.
Finally, we study the effect of soft errors on the convergence of the commonly used flexible GMRES (FGMRES) algorithm, which is also used to solve the preconditioned system in pARMS. The test problem in our experiments is an elliptic PDE problem on a regular grid. We consider two types of preconditioners: an incomplete LU factorization with dual threshold (ILUT), and the ARMS preconditioner combined with RBT randomization. We consider two soft fault models, in which we perturb the matrix-vector multiplication and the application of the preconditioner, and we compare their potential impact on the convergence of the solver. High performance computing Parallel iterative linear solvers Randomized algorithms GPU computing Flexible GMRES Soft fault models pARMS solver Preconditioning Fault tolerance
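The way a Random Butterfly Transformation removes the need for pivoting can be illustrated on a tiny dense system. The sketch below is not the pARMS integration described in the thesis. It assumes a depth-1 butterfly of the form (1/sqrt 2) [[R0, R1], [R0, -R1]] with random diagonal blocks whose entries are exp(r/10) for r uniform in [-1/2, 1/2]; that distribution and all function names are assumptions made for the example.

```python
import math
import random

def butterfly(n, rng):
    """Depth-1 butterfly (1/sqrt 2) * [[R0, R1], [R0, -R1]], n even."""
    h = n // 2
    r0 = [math.exp(rng.uniform(-0.5, 0.5) / 10) for _ in range(h)]
    r1 = [math.exp(rng.uniform(-0.5, 0.5) / 10) for _ in range(h)]
    s = 1 / math.sqrt(2)
    B = [[0.0] * n for _ in range(n)]
    for i in range(h):
        B[i][i] = s * r0[i]
        B[i][h + i] = s * r1[i]
        B[h + i][i] = s * r0[i]
        B[h + i][h + i] = -s * r1[i]
    return B

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def solve_no_pivoting(A, b):
    """Gaussian elimination without row exchanges; fails on a zero pivot."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for k in range(n):
        if M[k][k] == 0.0:
            raise ZeroDivisionError("zero pivot")
        for i in range(k + 1, n):
            f = M[i][k] / M[k][k]
            for j in range(k, n + 1):
                M[i][j] -= f * M[k][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

rng = random.Random(1)
A = [[0.0, 1.0], [1.0, 0.0]]      # zero leading pivot: elimination needs pivoting
b = [3.0, 5.0]

U, V = butterfly(2, rng), butterfly(2, rng)
Ar = matmul(matmul(transpose(U), A), V)                        # U^T A V
Utb = [sum(u * bi for u, bi in zip(col, b)) for col in zip(*U)]  # U^T b
y = solve_no_pivoting(Ar, Utb)
x = [sum(v * yi for v, yi in zip(row, y)) for row in V]          # x = V y
print(x)
```

Solving the randomized system `U^T A V y = U^T b` and mapping back with `x = V y` gives a solution of `A x = b` without any row exchange, whereas `solve_no_pivoting(A, b)` on the untransformed matrix stops at the zero pivot.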
