Global ETD Search

141	Parallel and Decentralized Algorithms for Big-data Optimization over Networks Amir Daneshmand (11153640) 22 July 2021 (has links) Recent decades have witnessed the rise of a data deluge generated by heterogeneous sources, e.g., social networks, streaming, and marketing services, which has naturally created a surge of interest in the theory and applications of large-scale convex and non-convex optimization. For example, real-world instances of statistical learning problems such as deep learning and recommendation systems can generate sheer volumes of spatially/temporally diverse data (up to petabytes of data in commercial applications) with millions of decision variables to be optimized. Such problems are often referred to as Big-data problems. Solving these problems by standard optimization methods demands an intractable amount of centralized storage and computational resources, which is infeasible; overcoming this limitation is the foremost purpose of the parallel and decentralized algorithms developed in this thesis.
This thesis consists of two parts: (I) Distributed Nonconvex Optimization and (II) Distributed Convex Optimization.
In Part (I), we start by studying a winning paradigm in big-data optimization, the Block Coordinate Descent (BCD) algorithm, which ceases to be effective when problem dimensions grow overwhelmingly. In particular, we consider a general family of constrained non-convex composite large-scale problems defined on multicore computing machines equipped with shared memory. We design a hybrid deterministic/random parallel algorithm that solves such problems efficiently by synergistically combining Successive Convex Approximation (SCA) with greedy/random dimensionality reduction techniques. We provide theoretical and empirical results showing the efficacy of the proposed scheme in the face of huge-scale problems. The next step is to broaden the network setting to general mesh networks modeled as directed graphs, and to propose a class of gradient-tracking based algorithms with global convergence guarantees to critical points of the problem. We further explore the geometry of the landscape of the non-convex problems to establish second-order guarantees, and we strengthen our results from convergence to locally optimal solutions to convergence to globally optimal solutions for a wide range of Machine Learning problems.
In Part (II), we focus on a family of distributed convex optimization problems defined over meshed networks. Relevant state-of-the-art algorithms often address restricted problem settings and have pessimistic communication complexities relative to their centralized variants, which raises an important question: can one achieve the rate of centralized first-order methods over networks, and moreover, can one improve upon their communication costs by using higher-order local solvers? To answer these questions, we propose an algorithm that uses surrogate objective functions in local solvers (hence going beyond first-order methods such as proximal-gradient) coupled with a perturbed (push-sum) consensus mechanism that aims to track locally the gradient of the central objective function. The algorithm is proved to match the convergence rate of its centralized counterparts, up to multiplicative network-dependent factors. In particular, for Empirical Risk Minimization (ERM) problems with statistically homogeneous data across the agents, our algorithm employing high-order surrogates provably achieves faster rates than those achievable by first-order methods. Such improvements are made without exchanging any Hessian matrices over the network.
Finally, we focus on the ill-conditioning issue that impairs the efficiency of decentralized first-order methods over networks and renders them impractical in terms of both computation and communication cost. A natural solution is to develop distributed second-order methods, but their need for Hessian information incurs substantial communication overhead on the network. To work around such exorbitant communication costs, we propose a “statistically informed” preconditioned cubic regularized Newton method which provably improves upon the rates of first-order methods. The proposed scheme does not require communication of Hessian information in the network, and yet achieves the iteration complexity of centralized second-order methods up to the statistical precision. In addition, the (second-order) approximate nature of the surrogate functions used improves upon the per-iteration computational cost of our earlier proposed scheme in this setting. Distributed Computing Operations Research Optimisation distributed optimization Large-Scale Optimization Distributed Machine Learning decentralized algorithms Parallel Computing convex optimization Nonconvex optimization Parallel algorithms
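The gradient-tracking idea mentioned in entry 141 can be illustrated with a minimal sketch. This is not the thesis's algorithm: it assumes an undirected network with a doubly stochastic weight matrix (the thesis treats directed graphs with push-sum), plain gradient steps instead of SCA surrogates, and scalar quadratic local costs. The names `gradient_tracking`, `targets`, and `weights` are hypothetical.

```python
def gradient_tracking(targets, weights, step=0.1, iterations=200):
    """Decentralized minimization of sum_i 0.5 * (x - targets[i])**2.

    Assumption: `weights` is a doubly stochastic mixing matrix of an
    undirected, connected network. Each agent i keeps a local estimate
    x[i] and a tracker y[i] of the network-average gradient.
    """
    n = len(targets)

    def grad(i, x):
        # Gradient of agent i's local cost at x.
        return x - targets[i]

    x = [0.0] * n
    y = [grad(i, x[i]) for i in range(n)]  # trackers start at local gradients
    for _ in range(iterations):
        # Consensus step on the estimates plus a move along the tracked gradient.
        x_new = [
            sum(weights[i][j] * x[j] for j in range(n)) - step * y[i]
            for i in range(n)
        ]
        # Consensus step on the trackers plus the local gradient correction.
        y = [
            sum(weights[i][j] * y[j] for j in range(n))
            + grad(i, x_new[i]) - grad(i, x[i])
            for i in range(n)
        ]
        x = x_new
    return x


# Three fully connected agents; the minimizer of the sum is the mean target.
mixing = [[0.5, 0.25, 0.25], [0.25, 0.5, 0.25], [0.25, 0.25, 0.5]]
estimates = gradient_tracking([1.0, 2.0, 6.0], mixing)
```

In this toy setting every agent's estimate approaches the average of the targets while the agents exchange only their scalars `x[i]` and `y[i]` with neighbours.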
142	Dense matrix computations : communication cost and numerical stability / Calculs pour les matrices denses : coût de communication et stabilité numérique Khabou, Amal 11 February 2013 (has links) This dissertation focuses on a widely used linear algebra kernel for solving linear systems, namely the LU decomposition. Usually, such a decomposition is computed by Gaussian elimination with partial pivoting (GEPP). The backward stability of GEPP depends on a quantity referred to as the growth factor; it is known that GEPP generally leads to modest element growth in practice. However, its parallel version does not attain the communication lower bounds. Indeed, the panel factorization represents a bottleneck in terms of communication. To overcome this communication bottleneck, Grigori et al. [60] developed a communication-avoiding LU factorization (CALU), which is asymptotically optimal in terms of communication cost at the price of some redundant computation. In theory, the upper bound of the growth factor of CALU is larger than that of Gaussian elimination with partial pivoting; however, CALU is stable in practice. To improve the upper bound of the growth factor, we study a new pivoting strategy based on strong rank-revealing QR factorization. We thus develop a new block algorithm for the LU factorization. This algorithm has a smaller growth factor upper bound than Gaussian elimination with partial pivoting. The strong rank-revealing pivoting is then combined with the tournament pivoting strategy to produce a communication-avoiding LU factorization that is more stable than CALU. For hierarchical systems, multiple levels of parallelism are available. However, none of the previously cited methods fully exploits these hierarchical systems. We propose and study two recursive algorithms based on the communication-avoiding LU algorithm, which are more suitable for architectures with multiple levels of parallelism. For an accurate and realistic cost analysis of these hierarchical algorithms, we introduce a hierarchical parallel performance model that takes into account processor and network hierarchies. This analysis enables us to accurately predict the performance of the hierarchical LU factorization on an exascale platform. LU factorization Growth factor Minimizing the communication cost Parallel algorithms Hierarchical systems Performance models Pivoting strategies
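To make the growth factor in entry 142 concrete, the following minimal sketch runs sequential Gaussian elimination with partial pivoting and reports the ratio of the largest entry seen during elimination to the largest entry of the input. It assumes this common definition of the growth factor and is not the thesis's CALU or rank-revealing algorithm; the function name `gepp_growth_factor` is hypothetical.

```python
def gepp_growth_factor(a):
    """Growth factor of Gaussian elimination with partial pivoting.

    Assumed definition: (largest |entry| over all elimination steps)
    divided by (largest |entry| of the input matrix).
    """
    n = len(a)
    u = [row[:] for row in a]  # work on a copy
    max_initial = max(abs(v) for row in u for v in row)
    max_seen = max_initial
    for k in range(n - 1):
        # Partial pivoting: pick the largest entry in column k at or below row k.
        pivot_row = max(range(k, n), key=lambda i: abs(u[i][k]))
        if u[pivot_row][k] == 0.0:
            continue  # column already eliminated
        u[k], u[pivot_row] = u[pivot_row], u[k]
        for i in range(k + 1, n):
            factor = u[i][k] / u[k][k]
            for j in range(k, n):
                u[i][j] -= factor * u[k][j]
        max_seen = max(max_seen, max(abs(v) for row in u for v in row))
    return max_seen / max_initial


# Classic worst case: ones on the diagonal and in the last column, -1 below
# the diagonal. The last column doubles at every step.
wilkinson = [
    [1.0, 0.0, 0.0, 1.0],
    [-1.0, 1.0, 0.0, 1.0],
    [-1.0, -1.0, 1.0, 1.0],
    [-1.0, -1.0, -1.0, 1.0],
]
print(gepp_growth_factor(wilkinson))  # 8.0, i.e. 2**(n - 1) for n = 4
```

The example matrix attains the worst-case bound of partial pivoting, which is the kind of bound the abstract compares across pivoting strategies; typical matrices show far smaller growth.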
143	Parallele dynamische Adaption hybrider Netze für effizientes verteiltes Rechnen / Parallel dynamic adaptation of hybrid grids for efficient distributed computing Alrutz, Thomas 17 September 2008 (has links) No description available. 510 Mathematics Mathematics and Computer Science Hybrid grids Grid adaptation distributed computing parallel algorithms 31.76 Numerical mathematics
144	INTERFACE DE ANÁLISE DA INTERCONEXÃO EM UMA LAN USANDO CORBA / Software development (graphical user interface) that makes possible to analyze the interconnection in a LAN (Local Area Network) using CORBA (Common Object Request Broker Architecture) MONTEIRO, Milson Silva 07 June 2002 (has links) Conselho Nacional de Desenvolvimento Científico e Tecnológico / This work concerns the development of software (a graphical user interface) that makes it possible to analyze the interconnection of a LAN (Local Area Network) using CORBA (Common Object Request Broker Architecture) in distributed and heterogeneous environments across several peripheral machines. The work presents paradigms of graph theory, namely shortest path problems (Dijkstra, Ford-Moore-Bellman), maximum flow problems (Edmonds-Karp) and minimum cost flow problems (Busacker-Gowen), to formalize the development of the interface. We discuss the graph theory and network flow concepts needed to provide the theoretical foundation.
Graph Theory Network Flows Shortest Path Problem Maximum Flow Problem Minimum Cost Flow Problem Parallel Algorithms
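As an illustration of the shortest path paradigm named in entry 144, here is a minimal sketch of Dijkstra's algorithm on a small link-cost graph. The adjacency-list representation, the function name `dijkstra`, and the example topology are assumptions made for this sketch; they are not taken from the thesis or from CORBA.

```python
import heapq


def dijkstra(graph, source):
    """Shortest path costs from `source` in a graph with non-negative costs.

    `graph` maps each node to a list of (neighbor, cost) pairs.
    """
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale queue entry, a shorter path was already found
        for neighbor, cost in graph.get(node, []):
            candidate = d + cost
            if candidate < dist.get(neighbor, float("inf")):
                dist[neighbor] = candidate
                heapq.heappush(heap, (candidate, neighbor))
    return dist


# Hypothetical LAN: nodes are machines, costs are link latencies.
lan = {
    "A": [("B", 4), ("C", 1)],
    "B": [("D", 1)],
    "C": [("B", 2), ("D", 7)],
    "D": [],
}
print(dijkstra(lan, "A"))  # {'A': 0.0, 'B': 3.0, 'C': 1.0, 'D': 4.0}
```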
145	Improvements in Genetic Approach to Pole Placement in Linear State Space Systems Through Island Approach PGA with Orthogonal Mutation Vectors Cassell, Arnold 01 January 2012 (has links) This thesis describes a genetic approach for shaping the dynamic responses of linear state space systems through pole placement. It further compares this approach with an island-approach parallel genetic algorithm (PGA) that incorporates orthogonal mutation vectors to increase sub-population specialization and decrease convergence time. Both approaches generate a gain vector K. The vector K is used in state feedback to alter the poles of the system so as to meet step response requirements such as settling time and percent overshoot. To obtain the gain vector K by the proposed genetic approaches, a pair of ideal, desired poles is calculated first. Those poles serve as the basis from which an initial population is created. In the island approach, those poles serve as a basis for n populations, where n is the dimension of the necessary K vector. Each member of the population is tested for its fitness (the degree to which it matches the criteria). A new population is created each “generation” from the results of the previous iteration, until the criteria are met or a certain number of generations have passed. Several case studies are provided to illustrate that the new approach works and to compare the performance of the two approaches. Thesis University of North Florida UNF Parallel algorithms -- Testing Orthogonalization methods -- Testing Genetic algorithms -- Testing Linear systems Pole Placement Genetic Algorithm State Space Orthogonal Mutation Controls and Control Theory
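The step in entry 145 where a pair of desired poles is derived from settling time and percent overshoot can be sketched as follows. The sketch assumes the standard second-order approximations (damping ratio from the overshoot formula, and the 2% settling-time rule Ts ≈ 4/(ζωn)); the abstract does not state which formulas the thesis uses, and the function name `desired_poles` is hypothetical.

```python
import math


def desired_poles(percent_overshoot, settling_time):
    """Dominant complex pole pair for a target step response.

    Assumptions: an underdamped second-order model, overshoot strictly
    between 0 and 100 percent, and the 2% settling-time approximation
    settling_time = 4 / (zeta * omega_n).
    """
    overshoot = percent_overshoot / 100.0
    log_os = math.log(overshoot)
    # Damping ratio from overshoot = exp(-pi * zeta / sqrt(1 - zeta**2)).
    zeta = -log_os / math.sqrt(math.pi ** 2 + log_os ** 2)
    omega_n = 4.0 / (zeta * settling_time)  # natural frequency in rad/s
    real = -zeta * omega_n
    imag = omega_n * math.sqrt(1.0 - zeta ** 2)
    return complex(real, imag), complex(real, -imag)


# 5% overshoot and a 2 s settling time give poles with real part -2.
pole, conjugate = desired_poles(5.0, 2.0)
```

A pole pair obtained this way could then seed the initial population of candidate gain vectors, as the abstract describes.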
146	A scalable approach to processing adaptive optics optical coherence tomography data from multiple sensors using multiple graphics processing units Kriske, Jeffery Edward, Jr. 12 1900 (has links) Indiana University-Purdue University Indianapolis (IUPUI) / Adaptive optics-optical coherence tomography (AO-OCT) is a non-invasive method of imaging the human retina in vivo. It can be used to visualize microscopic structures, making it highly useful for the early detection and diagnosis of retinal disease. The research group at Indiana University has a novel multi-camera AO-OCT system capable of 1 MHz acquisition rates. Until now, no method has existed to process data from such a system quickly and accurately enough on a CPU or a GPU, nor one that scales automatically and efficiently to multiple GPUs. This is a barrier to using a MHz AO-OCT system in a clinical environment. A novel approach to processing AO-OCT data from the multi-camera optics system is tested on multiple graphics processing units (GPUs) in parallel with one-, two-, and four-camera combinations. The design and results demonstrate a scalable, reusable, extensible method of computing AO-OCT output. This approach can either achieve real-time results with an AO-OCT system capable of 1 MHz acquisition rates or be scaled to a higher-accuracy mode with a fast Fourier transform of 16,384 complex values. OCT GPU Retina -- Tomography -- Research Diagnostic imaging -- Research Imaging systems in medicine Graphics processing units -- Programming Optical pattern recognition High performance computing -- Research Parallel algorithms -- Research
147	[en] AN EXPERIMENTAL INVESTIGATION OF PROBABILITY DISTRIBUTION OF SOLUTION TIME IN GRASP AND ITS APPLICATION ON THE ANALYSIS OF PARALLEL IMPLEMENTATIONS / [pt] UMA INVESTIGAÇÃO EXPERIMENTAL DA DISTRIBUIÇÃO DE PROBABILIDADE DO TEMPO DE SOLUCAO EM HEURISTICAS GRASP: E SUA APLICAÇÃO NA ANALISE DE IMPLEMENTAÇÕES PARALELAS RENATA MACHADO AIEX 13 June 2003 (has links) [en] GRASP (Greedy Randomized Adaptive Search Procedure) is a multi-start metaheuristic for combinatorial optimization problems. GRASP has been used to find good-quality solutions to many combinatorial optimization problems. In this work we describe a methodology for the analysis of GRASP. Hybrid strategies combining GRASP with path relinking are also proposed. These strategies are studied for the 3-index assignment problem (AP3) and for the job-shop scheduling problem (JSP) and are analyzed according to the proposed methodology. The methodology for the analysis of GRASP is used to predict qualitatively how the quality of the solution varies in a parallel independent GRASP, using data from the sequential version of GRASP as input. The GRASP algorithms for the AP3 and for the JSP are parallelized, and the computational results are compared with the results obtained using the proposed methodology. [en] GRASP [en] PARALLEL ALGORITHMS [en] JOB-SHOP SCHEDULING PROBLEM [en] 3-INDEX ASSIGNMENT PROBLEM [en] METHODOLOGY FOR ANALYSIS OF GRASP [en] COMBINATORIAL OPTIMIZATION [en] METAHEURISTICS
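The multi-start structure of GRASP described in entry 147 (greedy randomized construction followed by local search, repeated from scratch) can be sketched on a toy problem. The 0/1 knapsack instance, the restricted candidate list rule with parameter `alpha`, the swap neighbourhood, and the name `grasp_knapsack` are all assumptions chosen for brevity; the thesis studies AP3 and JSP with path relinking, none of which is reproduced here.

```python
import random


def grasp_knapsack(values, weights, capacity, iterations=50, alpha=0.3, seed=0):
    """Toy GRASP for the 0/1 knapsack problem (illustrative only)."""
    rng = random.Random(seed)
    n = len(values)
    best, best_value = set(), 0
    for _ in range(iterations):
        # Construction: repeatedly pick at random among the best-ratio items.
        chosen, load = set(), 0
        while True:
            candidates = [
                i for i in range(n)
                if i not in chosen and load + weights[i] <= capacity
            ]
            if not candidates:
                break
            ratios = {i: values[i] / weights[i] for i in candidates}
            high, low = max(ratios.values()), min(ratios.values())
            threshold = high - alpha * (high - low)
            restricted = [i for i in candidates if ratios[i] >= threshold]
            pick = rng.choice(restricted)
            chosen.add(pick)
            load += weights[pick]
        # Local search: swap one chosen item for a more valuable unchosen
        # one while the swap keeps the load within capacity.
        improved = True
        while improved:
            improved = False
            for out in sorted(chosen):
                for inn in range(n):
                    if inn in chosen:
                        continue
                    fits = load - weights[out] + weights[inn] <= capacity
                    if fits and values[inn] > values[out]:
                        chosen.remove(out)
                        chosen.add(inn)
                        load += weights[inn] - weights[out]
                        improved = True
                        break
                if improved:
                    break
        value = sum(values[i] for i in chosen)
        if value > best_value:
            best, best_value = set(chosen), value
    return best, best_value


items, total = grasp_knapsack([10, 7, 6, 4, 3], [5, 4, 3, 2, 1], capacity=8)
```

Because the iterations are independent, a parallel independent GRASP of the kind analyzed in the abstract can be obtained by running this loop with different seeds on different processors and keeping the best result.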
148	Analysis and waveform relaxation for a differential-algebraic electrical circuit model Pade, Jonas 22 July 2021 (has links) The main topics of this thesis are, firstly, a thorough analysis of nonlinear differential-algebraic equations (DAEs) of index 2 which arise from the modified nodal analysis (MNA) of electrical circuits and, secondly, the derivation of convergence criteria for waveform relaxation (WR) methods on coupled problems. In both topics, a particular focus is put on the relations between a circuit's topology and the mathematical properties of the corresponding DAE. The analysis encompasses a detailed description of a normal form for circuit DAEs of index 2 and its consequences for the sensitivity of the circuit with respect to its input source terms. More precisely, we provide bounds which describe how strongly changes in the input sources of the circuit affect its behaviour. Crucial constants in these bounds are determined in terms of the topological position of the input sources in the circuit. The increasingly complex electrical circuits in technological devices often call for modelling as coupled systems. Because it allows each subsystem to be solved with dedicated numerical solvers and step sizes, WR is a suitable method in this setting. It is well known that, while WR converges on ordinary differential equations if a Lipschitz condition is satisfied, an additional contractivity criterion is required to guarantee convergence on DAEs. We present general convergence criteria for WR on higher index DAEs. Furthermore, based on the results of the analysis part, we derive topological convergence criteria for coupled circuit/circuit problems and field/circuit problems. Examples illustrate how to check in practice whether the criteria are satisfied. If a sufficient convergence criterion holds, we specify the rate at which the Jacobi and Gauss-Seidel WR methods converge. Simulations of simple benchmark systems illustrate the drastically different convergence behaviour of WR depending on whether or not the circuit topological convergence conditions are satisfied. electrical circuits differential-algebraic equation nonlinear index 2 DAE coupled circuits field/circuit waveform relaxation modified nodal analysis MNA network topology convergence criteria cosimulation parallel algorithms 500 Natural sciences and mathematics 510 Mathematics 518 Numerical analysis SK 920 ZN 5310 ddc:500 ddc:510 ddc:518
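The Jacobi and Gauss-Seidel waveform relaxation schemes named in entry 148 can be sketched on a simple case. This sketch assumes two coupled linear ordinary differential equations discretized with explicit Euler, where convergence is unproblematic; it does not touch the index-2 DAE setting or the topological criteria of the thesis. All function names are hypothetical.

```python
def euler_reference(a, x0, h, steps):
    """Monolithic explicit Euler solution of x' = a x for a 2x2 matrix a."""
    x1, x2 = [x0[0]], [x0[1]]
    for n in range(steps):
        x1.append(x1[n] + h * (a[0][0] * x1[n] + a[0][1] * x2[n]))
        x2.append(x2[n] + h * (a[1][0] * x1[n] + a[1][1] * x2[n]))
    return x1, x2


def waveform_relaxation(a, x0, h, steps, sweeps, gauss_seidel=False):
    """Solve each scalar subsystem over the whole time window per sweep.

    Jacobi: both subsystems read the other's waveform from the previous
    sweep. Gauss-Seidel: the second subsystem reads the first one's
    freshly computed waveform.
    """
    x1 = [x0[0]] * (steps + 1)  # initial guess: constant waveforms
    x2 = [x0[1]] * (steps + 1)
    for _ in range(sweeps):
        new_x1 = [x0[0]]
        for n in range(steps):
            new_x1.append(new_x1[n] + h * (a[0][0] * new_x1[n] + a[0][1] * x2[n]))
        coupling = new_x1 if gauss_seidel else x1
        new_x2 = [x0[1]]
        for n in range(steps):
            new_x2.append(new_x2[n] + h * (a[1][0] * coupling[n] + a[1][1] * new_x2[n]))
        x1, x2 = new_x1, new_x2
    return x1, x2


def max_error(solution, reference):
    """Largest pointwise deviation between two pairs of waveforms."""
    return max(
        abs(s - r)
        for sol, ref in zip(solution, reference)
        for s, r in zip(sol, ref)
    )


system = [[-1.0, 0.5], [0.5, -2.0]]
reference = euler_reference(system, (1.0, 0.0), 0.01, 50)
jacobi = waveform_relaxation(system, (1.0, 0.0), 0.01, 50, sweeps=5)
seidel = waveform_relaxation(system, (1.0, 0.0), 0.01, 50, sweeps=5, gauss_seidel=True)
```

Comparing `max_error(jacobi, reference)` and `max_error(seidel, reference)` after the same number of sweeps gives a feel for the rate comparison mentioned in the abstract. Each subsystem could also use its own solver and step size, which is the practical appeal of the method.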
