1 |
LINEAR CONVERGENCE OF DISTRIBUTED GRADIENT TRACKING SCHEMES UNDER THE KL PROPERTYTejaskumar Pradipbhai Tamboli (12856589) 10 June 2022 (has links)
<p>We study decentralized multiagent optimization over networks modeled as undirected</p>
<p>graphs. The optimization problem consists of minimizing a (non convex) smooth function</p>
<p>plus a convex extended-value function, which enforces constraints or extra structure on the</p>
<p>solution (e.g., sparsity, low-rank). We further assume that the objective function satisfies the</p>
<p>Kurdyka- Lojasiewicz (KL) property, with exponent in [0, 1). The KL property is satisfied</p>
<p>by several (non convex) functions of practical interest, e.g., arising from machine learning</p>
<p>applications; in the centralized setting, it permits to achieve strong convergence guarantees.</p>
<p>Here we establish a first convergence result of the same type for distributed algorithms,</p>
<p>specifically the distributed gradient-tracking based algorithm SONATA, first proposed in</p>
<p>[ 1 ]. When exponent is in (0, 1/2], the sequence generated by SONATA is proved to converge to a</p>
<p>stationary solution of the problem at an R-linear rate whereas sublinear rate is certified</p>
<p>when KL exponent is in (1/2, 1). This matches the convergence behaviour of centralized proximal-gradient</p>
<p>algorithms. Numerical results validate our theoretical findings.</p>
|
2 |
Operator splitting methods for convex optimization : analysis and implementationBanjac, Goran January 2018 (has links)
Convex optimization problems are a class of mathematical problems which arise in numerous applications. Although interior-point methods can in principle solve these problems efficiently, they may become intractable for solving large-scale problems or be unsuitable for real-time embedded applications. Iterations of operator splitting methods are relatively simple and computationally inexpensive, which makes them suitable for these applications. However, some of their known limitations are slow asymptotic convergence, sensitivity to ill-conditioning, and inability to detect infeasible problems. The aim of this thesis is to better understand operator splitting methods and to develop reliable software tools for convex optimization. The main analytical tool in our investigation of these methods is their characterization as the fixed-point iteration of a nonexpansive operator. The fixed-point theory of nonexpansive operators has been studied for several decades. By exploiting the properties of such an operator, it is possible to show that the alternating direction method of multipliers (ADMM) can detect infeasible problems. Although ADMM iterates diverge when the problem at hand is unsolvable, the differences between subsequent iterates converge to a constant vector which is also a certificate of primal and/or dual infeasibility. Reliable termination criteria for detecting infeasibility are proposed based on this result. Similar ideas are used to derive necessary and sufficient conditions for linear (geometric) convergence of an operator splitting method and a bound on the achievable convergence rate. The new bound turns out to be tight for the class of averaged operators. Next, the OSQP solver is presented. OSQP is a novel general-purpose solver for quadratic programs (QPs) based on ADMM. The solver is very robust, is able to detect infeasible problems, and has been extensively tested on many problem instances from a wide variety of application areas. Finally, operator splitting methods can also be effective in nonconvex optimization. The developed algorithm significantly outperforms a common approach based on convex relaxation of the original nonconvex problem.
|
3 |
Studies on block coordinate gradient methods for nonlinear optimization problems with separable structure / 分離可能な構造をもつ非線形最適化問題に対するブロック座標勾配法の研究Hua, Xiaoqin 23 March 2015 (has links)
京都大学 / 0048 / 新制・課程博士 / 博士(情報学) / 甲第19123号 / 情博第569号 / 新制||情||100(附属図書館) / 32074 / 京都大学大学院情報学研究科数理工学専攻 / (主査)教授 山下 信雄, 教授 中村 佳正, 教授 田中 利幸 / 学位規則第4条第1項該当 / Doctor of Informatics / Kyoto University / DFAM
|
4 |
HIGH-DIMENSIONAL INFERENCE OVER NETWORKS: STATISTICAL AND COMPUTATIONAL GUARANTEESYao Ji (19697335) 19 September 2024 (has links)
<p dir="ltr">Distributed optimization problems defined over mesh networks are ubiquitous in signal processing, machine learning, and control. In contrast to centralized approaches where all information and computation resources are available at a centralized server, agents on a distributed system can only use locally available information. As a result, efforts have been put into the design of efficient distributed algorithms that take into account the communication constraints and make coordinated decisions in a fully distributed manner from a pure optimization perspective. Given the massive sample size and high-dimensionality generated by distributed systems such as social media, sensor networks, and cloud-based databases, it is essential to understand the statistical and computational guarantees of distributed algorithms to solve such high-dimensional problems over a mesh network.</p><p dir="ltr">A goal of this thesis is a first attempt at studying the behavior of distributed methods in the high-dimensional regime. It consists of two parts: (I) distributed LASSO and (II) distributed stochastic sparse recovery.</p><p dir="ltr">In Part (I), we start by studying linear regression from data distributed over a network of agents (with no master node) by means of LASSO estimation, in high-dimension, which allows the ambient dimension to grow faster than the sample size. While there is a vast literature of distributed algorithms applicable to the problem, statistical and computational guarantees of most of them remain unclear in high dimensions. This thesis provides a first statistical study of the Distributed Gradient Descent (DGD) in the Adapt-Then-Combine (ATC) form. Our theory shows that, under standard notions of restricted strong convexity and smoothness of the loss functions--which hold with high probability for standard data generation models--suitable conditions on the network connectivity and algorithm tuning, DGD-ATC converges globally at a linear rate to an estimate that is within the centralized statistical precision of the model. In the worst-case scenario, the total number of communications to statistical optimality grows logarithmically with the ambient dimension, which improves on the communication complexity of DGD in the Combine-Then-Adapt (CTA) form, scaling linearly with the dimension. This reveals that mixing gradient information among agents, as DGD-ATC does, is critical in high-dimensions to obtain favorable rate scalings. </p><p dir="ltr">In Part (II), we focus on addressing the problem of distributed stochastic sparse recovery through stochastic optimization. We develop and analyze stochastic optimization algorithms for problems over a network, modeled as an undirected graph (with no centralized node), where the expected loss is strongly convex with respect to the Euclidean norm, and the optimum is sparse. Assuming agents only have access to unbiased estimates of the gradients of the underlying expected objective, and stochastic gradients are sub-Gaussian, we use distributed stochastic dual averaging (DSDA) as a building block to develop a fully decentralized restarting procedure for recovery of sparse solutions over a network. We show that with high probability, the iterates generated by all agents linearly converge to an approximate solution, eliminating fast the initial error; and then converge sublinearly to the exact sparse solution in the steady-state stages owing to observation noise. The algorithm asymptotically achieves the optimal convergence rate and favorable dimension dependence enjoyed by a non-Euclidean centralized scheme. Further, we precisely identify its non-asymptotic convergence rate as a function of characteristics of the objective functions and the network, and we characterize the transient time needed for the algorithm to approach the optimal rate of convergence. We illustrate the performance of the algorithm in application to classical problems of sparse linear regression, sparse logistic regression and low rank matrix recovery. Numerical experiments demonstrate the tightness of the theoretical results.</p>
|
5 |
Local and Global Analysis of Relaxed Douglas-Rachford for Nonconvex Feasibility ProblemsMartins, Anna-Lena 19 March 2019 (has links)
No description available.
|
6 |
Fixed Point Algorithms for Nonconvex Feasibility with ApplicationsHesse, Robert 14 July 2014 (has links)
No description available.
|
7 |
Analysis of Randomized Adaptive Algorithms for Black-Box Continuous Constrained Optimization / Analyse d'algorithmes stochastiques adaptatifs pour l'optimisation numérique boîte-noire avec contraintesAtamna, Asma 25 January 2017 (has links)
On s'intéresse à l'étude d'algorithmes stochastiques pour l'optimisation numérique boîte-noire. Dans la première partie de cette thèse, on présente une méthodologie pour évaluer efficacement des stratégies d'adaptation du step-size dans le cas de l'optimisation boîte-noire sans contraintes. Le step-size est un paramètre important dans les algorithmes évolutionnaires tels que les stratégies d'évolution; il contrôle la diversité de la population et, de ce fait, joue un rôle déterminant dans la convergence de l'algorithme. On présente aussi les résultats empiriques de la comparaison de trois méthodes d'adaptation du step-size. Ces algorithmes sont testés sur le testbed BBOB (black-box optimization benchmarking) de la plateforme COCO (comparing continuous optimisers). Dans la deuxième partie de cette thèse, sont présentées nos contributions dans le domaine de l'optimisation boîte-noire avec contraintes. On analyse la convergence linéaire d'algorithmes stochastiques adaptatifs pour l'optimisation sous contraintes dans le cas de contraintes linéaires, gérées avec une approche Lagrangien augmenté adaptative. Pour ce faire, on étend l'analyse par chaines de Markov faite dans le cas d'optimisation sans contraintes au cas avec contraintes: pour chaque algorithme étudié, on exhibe une classe de fonctions pour laquelle il existe une chaine de Markov homogène telle que la stabilité de cette dernière implique la convergence linéaire de l'algorithme. La convergence linéaire est déduite en appliquant une loi des grands nombres pour les chaines de Markov, sous l'hypothèse de la stabilité. Dans notre cas, la stabilité est validée empiriquement. / We investigate various aspects of adaptive randomized (or stochastic) algorithms for both constrained and unconstrained black-box continuous optimization. The first part of this thesis focuses on step-size adaptation in unconstrained optimization. We first present a methodology for assessing efficiently a step-size adaptation mechanism that consists in testing a given algorithm on a minimal set of functions, each reflecting a particular difficulty that an efficient step-size adaptation algorithm should overcome. We then benchmark two step-size adaptation mechanisms on the well-known BBOB noiseless testbed and compare their performance to the one of the state-of-the-art evolution strategy (ES), CMA-ES, with cumulative step-size adaptation. In the second part of this thesis, we investigate linear convergence of a (1 + 1)-ES and a general step-size adaptive randomized algorithm on a linearly constrained optimization problem, where an adaptive augmented Lagrangian approach is used to handle the constraints. To that end, we extend the Markov chain approach used to analyze randomized algorithms for unconstrained optimization to the constrained case. We prove that when the augmented Lagrangian associated to the problem, centered at the optimum and the corresponding Lagrange multipliers, is positive homogeneous of degree 2, then for algorithms enjoying some invariance properties, there exists an underlying homogeneous Markov chain whose stability (typically positivity and Harris-recurrence) leads to linear convergence to both the optimum and the corresponding Lagrange multipliers. We deduce linear convergence under the aforementioned stability assumptions by applying a law of large numbers for Markov chains. We also present a general framework to design an augmented-Lagrangian-based adaptive randomized algorithm for constrained optimization, from an adaptive randomized algorithm for unconstrained optimization.
|
Page generated in 0.0648 seconds