Global ETD Search

1	Isometry and convexity in dimensionality reduction
Vasiloglou, Nikolaos, 30 March 2009
The volume of data generated every year grows exponentially. Both the number of data points and the number of dimensions have increased dramatically over the past 15 years. The gap between the industry's demand for data processing and the solutions provided by the machine learning community is widening. Despite the growth in memory and computational power, advanced statistical processing of data on the order of gigabytes remains out of reach. Most sophisticated machine learning algorithms have at least quadratic complexity. With current computer architectures, algorithms with complexity higher than O(N) or O(N log N) are not considered practical. Dimensionality reduction is a challenging problem in machine learning. Data represented as multidimensional points often have high dimensionality, yet the information they carry can be expressed with far fewer dimensions. Moreover, the reduced dimensions can be easier to interpret than the original ones. There is a great variety of dimensionality reduction algorithms under the theory of Manifold Learning. Most of these methods, such as Isomap, Local Linear Embedding, Local Tangent Space Alignment and Diffusion Maps, have been extensively studied under the framework of Kernel Principal Component Analysis (KPCA). In this dissertation we study two current state-of-the-art dimensionality reduction methods, Maximum Variance Unfolding (MVU) and Non-Negative Matrix Factorization (NMF). These two methods do not fit under the umbrella of Kernel PCA. MVU is cast as a Semidefinite Program, a modern convex nonlinear optimization formulation that offers more flexibility and power than KPCA. Although MVU and NMF seem to be two disconnected problems, we show that there is a connection between them. 
Both are special cases of a general nonlinear factorization algorithm that we developed. Two aspects of the algorithms are of particular interest: computational complexity and interpretability. Computational complexity answers the question of how fast we can find the best solution of MVU/NMF for large data volumes. Since we are dealing with optimization programs, we need to find the global optimum, and the ability to find the global optimum is strongly connected with the convexity of the problem. Interpretability is strongly connected with local isometry, which gives meaning to relationships between data points. Another aspect of interpretability is the association of data with labeled information. The contributions of this thesis are the following:
1. MVU is modified so that it can scale more efficiently. Results are shown on speech datasets with 1 million points. Limitations of the method are highlighted.
2. An algorithm for fast computation of the furthest neighbors is presented for the first time in the literature.
3. Construction of optimal kernels for Kernel Density Estimation with modern convex programming is presented. For the first time we show that the Leave-One-Out Cross Validation (LOOCV) function is quasi-concave.
4. For the first time NMF is formulated as a convex optimization problem.
5. An algorithm for the problem of Completely Positive Matrix Factorization is presented.
6. A hybrid algorithm of MVU and NMF, isoNMF, is presented, combining the advantages of both methods.
7. Isometric Separation Maps (ISM), a variation of MVU that contains classification information, is presented.
8. Large-scale nonlinear dimensional analysis on the TIMIT speech database is performed.
9. A general nonlinear factorization algorithm is presented based on sequential convex programming.
Despite the efforts to scale the proposed methods up to 1 million data points in reasonable time, the gap between the industrial demand and the current state of the art is still orders of magnitude wide. 
Isometry; Convexity; Dimensionality reduction; Factorizations; Maximum variance unfolding; Semidefinite programming; Dimensional analysis; Isometrics (Mathematics); Convex domains
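The abstract above ties interpretability to local isometry, meaning that distances between each point and its nearest neighbors survive the reduction. A minimal sketch of how that property could be checked is given below. It is not code from the dissertation: the function names `knn_indices` and `local_isometry_error`, the brute-force neighbor search and the toy data are all assumptions made for illustration.

```python
import math


def knn_indices(points, k):
    """For each point, return the indices of its k nearest neighbors.

    Brute force, O(N^2 log N); fine for a toy check, not for large data.
    """
    result = []
    for i, p in enumerate(points):
        others = [j for j in range(len(points)) if j != i]
        others.sort(key=lambda j: math.dist(p, points[j]))
        result.append(others[:k])
    return result


def local_isometry_error(original, embedded, k):
    """Largest relative change in distance between any point and one of
    its k nearest neighbors (neighbors are chosen in the original space).

    A value of 0.0 means the embedding preserves all local distances.
    """
    worst = 0.0
    for i, neighbors in enumerate(knn_indices(original, k)):
        for j in neighbors:
            d_orig = math.dist(original[i], original[j])
            d_emb = math.dist(embedded[i], embedded[j])
            if d_orig > 0:
                worst = max(worst, abs(d_emb - d_orig) / d_orig)
    return worst


# Hypothetical toy data: five points on a straight line in 3-D,
# spaced 3 units apart, and two candidate 1-D embeddings.
original = [(t, 2 * t, 2 * t) for t in range(5)]
isometric = [(3 * t,) for t in range(5)]   # keeps neighbor distances
stretched = [(6 * t,) for t in range(5)]   # doubles neighbor distances

print(local_isometry_error(original, isometric, k=2))
print(local_isometry_error(original, stretched, k=2))
```

The first embedding should report an error close to zero and the second an error close to one, since every neighbor distance is doubled. A method such as MVU enforces this kind of constraint inside the optimization rather than checking it afterwards.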
2	Block-decomposition and accelerated gradient methods for large-scale convex optimization
Ortiz Diaz, Camilo, 08 June 2015
In this thesis, we develop block-decomposition (BD) methods and variants of accelerated gradient methods for large-scale conic programming and convex optimization, respectively. The BD methods, discussed in the first two parts of this thesis, are inexact versions of proximal-point methods applied to two-block-structured inclusion problems. The adaptive accelerated methods, presented in the last part of this thesis, can be viewed as new variants of Nesterov's optimal method. In an effort to improve their practical performance, these methods incorporate important speed-up refinements motivated by theoretical iteration-complexity bounds and our observations from extensive numerical experiments. We provide several benchmarks on various important problem classes to demonstrate the efficiency of the proposed methods compared to the most competitive ones proposed earlier in the literature. In the first part of this thesis, we consider exact BD first-order methods for solving conic semidefinite programming (SDP) problems and the more general problem that minimizes the sum of a convex differentiable function with Lipschitz continuous gradient, and two other proper closed convex (possibly nonsmooth) functions. More specifically, these problems are reformulated as two-block monotone inclusion problems and exact BD methods, namely the ones that solve both proximal subproblems exactly, are used to solve them. In addition to being able to solve standard form conic SDP problems, the latter approach is also able to directly solve specially structured non-standard form conic programming problems without the need to add additional variables and/or constraints to bring them into standard form. 
Several ingredients are introduced to speed up the BD methods in their pure form, such as: adaptive (aggressive) choices of stepsizes for performing the extragradient step; and dynamic updates of scaled inner products to balance the blocks. Finally, computational results on several classes of SDPs are presented showing that the exact BD methods outperform the three most competitive codes for solving large-scale conic semidefinite programming. In the second part of this thesis, we present an inexact BD first-order method for solving standard form conic SDP problems which avoids computations of exact projections onto the manifold defined by the affine constraints and, as a result, is able to handle extra large-scale SDP instances. In this BD method, while the proximal subproblem corresponding to the first block is solved exactly, the one corresponding to the second block is solved inexactly in order to avoid finding the exact solution of a linear system corresponding to the manifolds consisting of both the primal and dual affine feasibility constraints. Our implementation uses the conjugate gradient method applied to a reduced positive definite dual linear system to obtain inexact solutions of the latter augmented primal-dual linear system. In addition, the inexact BD method incorporates a new dynamic scaling scheme that uses two scaling factors to balance three inclusions comprising the optimality conditions of the conic SDP. Finally, we present computational results showing the efficiency of our method for solving various extra large SDP instances, several of which cannot be solved by other existing methods, including some with at least two million constraints and/or fifty million non-zero coefficients in the affine constraints. In the last part of this thesis, we consider an adaptive accelerated gradient method for a general class of convex optimization problems. 
More specifically, we present a new accelerated variant of Nesterov's optimal method in which certain acceleration parameters are adaptively (and aggressively) chosen so as to: preserve the theoretical iteration-complexity of the original method; and substantially improve its practical performance in comparison to the other existing variants. Computational results are presented to demonstrate that the proposed adaptive accelerated method performs quite well compared to other variants proposed earlier in the literature.
Semidefinite programming; Large-scale; Conjugate gradient; Accelerated gradient methods; Convex optimization; Quadratic programming; Complexity; Proximal; Extragradient; Block-decomposition; Conic optimization
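For readers unfamiliar with the baseline that the last part of this abstract builds on, the following is a minimal sketch of a classic accelerated gradient iteration with a constant step size 1/L and the standard momentum sequence (the FISTA-style form of Nesterov's method for smooth problems). It is not the adaptive variant proposed in the thesis, whose parameter rules are not given in the abstract. The names `accelerated_gradient` and `example_grad` and the small quadratic test problem are assumptions for illustration.

```python
def accelerated_gradient(grad, x0, lipschitz, iterations):
    """Accelerated gradient descent with constant step 1/L.

    grad      : function returning the gradient (list of floats) at a point
    x0        : starting point (list of floats)
    lipschitz : Lipschitz constant L of the gradient
    """
    x = list(x0)          # current iterate
    y = list(x0)          # extrapolated point where the gradient is taken
    t = 1.0               # momentum sequence
    step = 1.0 / lipschitz
    for _ in range(iterations):
        g = grad(y)
        # Gradient step from the extrapolated point.
        x_next = [yi - step * gi for yi, gi in zip(y, g)]
        # Standard update of the momentum sequence.
        t_next = (1.0 + (1.0 + 4.0 * t * t) ** 0.5) / 2.0
        momentum = (t - 1.0) / t_next
        # Extrapolate along the direction of the last move.
        y = [xn + momentum * (xn - xo) for xn, xo in zip(x_next, x)]
        x, t = x_next, t_next
    return x


def example_grad(x):
    """Gradient of the hypothetical test function
    f(x) = 0.5 * (1 * (x[0] - 1)^2 + 10 * (x[1] + 2)^2),
    whose minimizer is (1, -2) and whose gradient has L = 10.
    """
    return [1.0 * (x[0] - 1.0), 10.0 * (x[1] + 2.0)]


solution = accelerated_gradient(example_grad, [0.0, 0.0], lipschitz=10.0, iterations=200)
print(solution)
```

The printed point should lie close to the minimizer (1, -2). The adaptive methods described in the abstract differ in that the acceleration parameters are chosen aggressively at run time while the same worst-case iteration complexity is preserved.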
