Global ETD Search

1	On inexact Newton directions in interior point methods for linear optimization
Al-Jeiroudi, Ghussoun (January 2009)
Each iteration of an interior point method (IPM) requires the solution of at least one linear system, and solving these systems accounts for most of the computational effort of IPMs. Solving them with a direct method becomes very expensive for large-scale problems. This thesis is concerned with using an iterative method to solve the reduced KKT systems arising in IPMs for linear programming. The augmented system form of this linear system has a number of advantages, notably a higher degree of sparsity than the normal equations form. We design a block triangular preconditioner for this system, constructed from a nonsingular basis matrix identified from an estimate of the optimal partition in the linear program. We use the preconditioned conjugate gradients (PCG) method to solve the augmented system. Although the augmented system is indefinite, short-recurrence iterative methods such as PCG can be applied to indefinite systems in certain situations. This approach has been implemented within the HOPDM interior point solver. Because the KKT system is solved only approximately, it becomes necessary to study the convergence of the IPM in this inexact case. We present the convergence analysis of the inexact infeasible path-following algorithm, prove the global convergence of this method, and provide a complexity analysis.
519.72
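As a sketch of how the two forms named in this abstract relate, the block below uses standard IPM notation that is assumed here rather than taken from the thesis: $A$ is the constraint matrix, $\Theta$ is a positive diagonal scaling matrix, and $f$, $g$ are generic right-hand sides. Eliminating $\Delta x$ from the augmented system gives the normal equations.

```latex
\begin{bmatrix} -\Theta^{-1} & A^{T} \\ A & 0 \end{bmatrix}
\begin{bmatrix} \Delta x \\ \Delta y \end{bmatrix}
=
\begin{bmatrix} f \\ g \end{bmatrix}
\quad\Longrightarrow\quad
\Delta x = \Theta\,(A^{T}\Delta y - f),
\qquad
A\,\Theta A^{T}\,\Delta y = g + A\,\Theta f .
```

The product $A\Theta A^{T}$ can be far denser than $A$ itself (a single dense column of $A$ is enough to fill it in), which is consistent with the sparsity advantage the abstract attributes to the augmented form.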
2	Preconditioned iterative methods for a class of nonlinear eigenvalue problems
Solov'ëv, Sergey I. (31 August 2006)
In this paper we develop new preconditioned iterative methods for solving monotone nonlinear eigenvalue problems. We investigate the convergence and derive grid-independent error estimates for these methods. Numerical experiments demonstrate the practical effectiveness of the proposed methods for a model problem.
preconditioned iterative method; symmetric eigenvalue problem; ddc:510; gradient method; conjugate gradient method; nonlinear eigenvalue problem; preconditioning
3	Acceleration of Computer Based Simulation, Image Processing, and Data Analysis Using Computer Clusters with Heterogeneous Accelerators
Chen, Chong (January 2016)
No description available.
Computer Engineering; parallel computing; distributed computing; GPGPU; Xeon Phi; Preconditioned Iterative Solver; ALS; bilateral filtering
4	Singular Value Computation and Subspace Clustering
Liang, Qiao (01 January 2015)
In this dissertation we discuss two problems. In the first part, we consider the problem of computing a few extreme eigenvalues of a symmetric definite generalized eigenvalue problem or a few extreme singular values of a large and sparse matrix. The standard method of choice for computing a few extreme eigenvalues of a large symmetric matrix is the Lanczos method or the implicitly restarted Lanczos method. These methods usually employ a shift-and-invert transformation to accelerate convergence, which is not practical for truly large problems. With this in mind, Golub and Ye proposed an inverse-free preconditioned Krylov subspace method, which uses preconditioning instead of shift-and-invert to accelerate convergence. To compute several eigenvalues, Wielandt deflation is used in a straightforward manner. However, Wielandt deflation alters the structure of the problem and may cause difficulties in certain applications such as singular value computations. We therefore first propose a deflation by restriction method for the inverse-free Krylov subspace method. We generalize the original convergence theory for the inverse-free preconditioned Krylov subspace method to justify this deflation scheme. We next extend the inverse-free Krylov subspace method with deflation by restriction to the singular value problem. We consider preconditioning based on robust incomplete factorization to accelerate convergence. Numerical examples are provided to demonstrate the efficiency and robustness of the new algorithm.
In the second part of this thesis, we consider the so-called subspace clustering problem, which aims to extract a multi-subspace structure from a collection of points lying in a high-dimensional space. Recently, methods based on the self-expressiveness property (SEP), such as Sparse Subspace Clustering and Low Rank Representations, have been shown to outperform other methods. However, methods with SEP may result in representations that are not amenable to clustering through graph partitioning. We propose a method in which the points are expressed in terms of an orthonormal basis. The orthonormal basis is optimally chosen in the sense that the representation of all points is sparsest. Numerical results are given to illustrate the effectiveness and efficiency of this method.
singular value decomposition; machine learning; subspace clustering; Numerical Analysis and Computation; Theory and Algorithms
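The deflation idea mentioned in the first part of this abstract can be illustrated with a minimal sketch. It is not the inverse-free Krylov method of the dissertation: it assumes plain power iteration as the eigensolver and the simplest symmetric rank-one deflation (often called Hotelling deflation, a special case of Wielandt deflation). The matrix and the function names are hypothetical.

```python
import math

def matvec(A, x):
    """Dense matrix-vector product for a list-of-lists matrix."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def power_iteration(A, iters=200):
    """Approximate the dominant eigenpair of a symmetric matrix A."""
    # Start vector with a component along every eigenvector of the example.
    x = [1.0 / (i + 1) for i in range(len(A))]
    for _ in range(iters):
        y = matvec(A, x)
        norm = math.sqrt(sum(v * v for v in y))
        x = [v / norm for v in y]
    lam = sum(a * b for a, b in zip(x, matvec(A, x)))  # Rayleigh quotient
    return lam, x

def deflate(A, lam, v):
    """Rank-one deflation A - lam * v v^T: moves the found eigenvalue to 0."""
    n = len(A)
    return [[A[i][j] - lam * v[i] * v[j] for j in range(n)] for i in range(n)]

# Hypothetical example: eigenvalues of [[2, 1], [1, 2]] are 3 and 1.
A = [[2.0, 1.0], [1.0, 2.0]]
lam1, v1 = power_iteration(A)
lam2, v2 = power_iteration(deflate(A, lam1, v1))
print(lam1, lam2)
```

The deflated matrix is a dense rank-one update of the original, which gives a small-scale picture of why this kind of deflation changes the structure of the problem.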
5	NEW COMPUTATIONAL METHODS FOR OPTIMAL CONTROL OF PARTIAL DIFFERENTIAL EQUATIONS
Liu, Jun (01 August 2015)
Partial differential equations are the chief means of providing mathematical models in science, engineering and other fields. Optimal control of partial differential equations (PDEs) has many applications in engineering and science, such as shape optimization, image processing, fluid dynamics, and chemical processes. In this thesis, we develop and analyze several efficient numerical methods for optimal control problems governed by elliptic, parabolic, and wave PDEs, respectively. The thesis consists of six chapters.
In Chapter 1, we briefly introduce a few motivating applications and summarize the theoretical and computational foundations of the approaches developed in the following chapters.
In Chapter 2, we establish a new multigrid algorithm to accelerate the semi-smooth Newton method applied to the first-order necessary optimality system arising from semi-linear control-constrained elliptic optimal control problems. Under suitable assumptions, the discretized Jacobian matrix is proved to have a uniformly bounded inverse with respect to the mesh size. Unlike currently available approaches, a new strategy that leads to a robust multigrid solver is employed to define the coarse grid operator. Numerical simulations illustrate the efficiency of the proposed method, which proves to be computationally more efficient than the popular full approximation storage (FAS) multigrid method. In particular, the proposed approach achieves mesh-independent convergence, and its performance is highly robust with respect to the regularization parameter.
In Chapter 3, we present a new second-order leapfrog finite difference scheme in time for solving the first-order necessary optimality system of linear parabolic optimal control problems. The new leapfrog scheme is shown to be unconditionally stable and second-order accurate, whereas the classical leapfrog scheme is well known to be unstable. A mathematical proof of the convergence of the proposed scheme is provided under a suitable norm. Moreover, the proposed leapfrog scheme has a favorable structure that leads to an effective implementation of a fast solver under the multigrid framework. Numerical examples show that the proposed scheme significantly outperforms the widely used second-order backward time differentiation approach, and the resulting fast solver demonstrates mesh-independent convergence as well as linear time complexity.
In Chapter 4, we develop a new semi-smooth Newton multigrid algorithm for solving the discretized first-order necessary optimality system that characterizes the optimal solution of semi-linear parabolic PDE optimal control problems with control constraints. A new leapfrog discretization scheme in time, combined with the standard five-point stencil in space, is established to achieve second-order accuracy. The convergence (or unconditional stability) of the proposed scheme is proved when time-periodic solutions are considered. Moreover, the well-structured discretized Jacobian matrices greatly facilitate the development of an effective smoother in our multigrid algorithm. Numerical simulations illustrate the effectiveness of the proposed method and validate the second-order accuracy of the solution approximations as well as the optimal linear complexity of the computing time.
In Chapter 5, we offer a new implicit finite difference scheme in time for solving the first-order necessary optimality system arising in optimal control of wave equations. With a five-point central finite difference scheme in space, the full discretization is proved to be unconditionally convergent with second-order accuracy, so it is not restricted by the classical Courant-Friedrichs-Lewy (CFL) stability condition on the spatial and temporal step sizes. Moreover, based on the advantageous structure of the discretization, an efficient preconditioned Krylov subspace method is provided and analyzed for solving the discretized sparse linear system. Numerical examples are presented to confirm our theoretical conclusions and demonstrate the promising performance of the proposed preconditioned iterative solver.
Finally, brief summaries and future research perspectives are given in Chapter 6.
finite difference scheme; multigrid method; optimal control; partial differential equations; preconditioned Krylov subspace method; semi-smooth Newton method
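The abstract's remark that the classical leapfrog scheme is unstable for parabolic problems can be seen on a scalar model. The sketch below assumes the decay equation y' = -lam * y as a stand-in for a single mode of a parabolic problem; it shows only the classical scheme, not the new scheme of the thesis, and the parameter values are arbitrary.

```python
import math

def classical_leapfrog(lam, h, steps):
    """Apply y[n+1] = y[n-1] - 2*h*lam*y[n] to y' = -lam*y with y(0) = 1."""
    y_prev = 1.0                  # y at t = 0
    y_curr = math.exp(-lam * h)   # exact value at t = h as the second start value
    for _ in range(steps - 1):
        y_prev, y_curr = y_curr, y_prev - 2.0 * h * lam * y_curr
    return y_curr                 # approximation at t = steps * h

# Hypothetical parameters: h * lam = 0.1, integrate up to t = 4.
lam, h, steps = 10.0, 0.01, 400
approx = classical_leapfrog(lam, h, steps)
exact = math.exp(-lam * h * steps)
print(approx, exact)
```

The recurrence has the characteristic roots -h*lam ± sqrt(1 + (h*lam)^2), and one of them always exceeds 1 in magnitude for h*lam > 0. The parasitic mode therefore grows for any step size while the exact solution decays.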
7	Resolução de um problema térmico inverso utilizando processamento paralelo em arquiteturas de memória compartilhada / Resolution of an inverse thermal problem using parallel processing on shared memory architectures
Ansoni, Jonas Laerte (03 September 2010)
Parallel programming has frequently been adopted for the development of applications that demand high-performance computing. With the advent of multi-core architectures and the existence of several levels of parallelism, it is important to define parallel programming strategies that take advantage of the processing power of these architectures. In this context, this study evaluates the performance of multi-core architectures, mainly that offered by graphics processing units (GPUs) and multi-core CPUs, in solving an inverse thermal problem. Parallel algorithms for the GPU and the CPU were developed using, respectively, two programming tools for shared memory architectures: NVIDIA CUDA (Compute Unified Device Architecture) and the POSIX Threads API. The preconditioned conjugate gradient method for solving sparse linear systems was implemented in CUDA entirely within the global memory of the GPU. The algorithm was evaluated on two GPU models, which proved more efficient, with a speedup of four times over the serial version of the algorithm. The parallel POSIX Threads application was evaluated on different multi-core CPUs with different microarchitectures. Compiler optimization flags were used to obtain higher performance from the parallelized code, and they proved very effective in the developed application: the parallelized code built with optimization flags ran about twelve times faster than the serial version on the same processor without any optimization. Thus both the approach using the GPU as a generic coprocessor to the CPU and the parallel application using multi-core CPUs proved to be efficient tools for solving the inverse thermal problem.
GPGPU; CUDA; preconditioned conjugate gradient; sparse matrix; parallel processing; POSIX threads; sparse numerical solver
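For orientation, the preconditioned conjugate gradient iteration named in this abstract can be sketched serially as follows. This is a minimal CPU illustration, not the CUDA implementation of the thesis: the Jacobi (diagonal) preconditioner, the dense list-of-lists storage, and the test matrix are all assumptions made for the example.

```python
import math

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def matvec(A, x):
    return [dot(row, x) for row in A]

def jacobi_pcg(A, b, tol=1e-10, max_iter=1000):
    """Conjugate gradients with a diagonal preconditioner for SPD A."""
    n = len(b)
    inv_diag = [1.0 / A[i][i] for i in range(n)]   # M^{-1} = diag(A)^{-1}
    x = [0.0] * n
    r = list(b)                                    # residual b - A x for x = 0
    z = [d * ri for d, ri in zip(inv_diag, r)]     # preconditioned residual
    p = list(z)
    rz = dot(r, z)
    b_norm = math.sqrt(dot(b, b))
    for k in range(1, max_iter + 1):
        Ap = matvec(A, p)
        alpha = rz / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        if math.sqrt(dot(r, r)) <= tol * b_norm:   # relative residual test
            return x, k
        z = [d * ri for d, ri in zip(inv_diag, r)]
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, max_iter

# Hypothetical SPD test matrix: tridiagonal and strictly diagonally dominant.
n = 20
A = [[0.0] * n for _ in range(n)]
for i in range(n):
    A[i][i] = 2.0 + i
    if i > 0:
        A[i][i - 1] = A[i - 1][i] = -1.0
b = matvec(A, [1.0] * n)          # right-hand side whose solution is all ones
x, iterations = jacobi_pcg(A, b)
print(iterations, max(abs(xi - 1.0) for xi in x))
```

Each iteration consists of one matrix-vector product, two inner products and a few vector updates, which are the kernels a GPU or multithreaded version would parallelize.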
8	Optimal iterative solvers for linear systems with stochastic PDE origins: balanced black-box stopping tests
Pranjal, Pranjal (January 2017)
The central theme of this thesis is the design of optimal balanced black-box stopping criteria in iterative solvers of symmetric positive-definite, symmetric indefinite, and nonsymmetric linear systems arising from finite element approximation of stochastic (parametric) partial differential equations. For a given stochastic and spatial approximation, it is known that iteratively solving the corresponding linear(ized) system(s) of equations to too tight an algebraic error tolerance wastes computational resources without decreasing the usually unknown approximation error. In order to stop optimally, avoiding both unnecessary computations and premature stopping, the algebraic error and the a posteriori approximation error estimate must be balanced at the optimal stopping iteration. Efficient and reliable a posteriori error estimators do exist for close estimation of the approximation error in a finite element setting, but the algebraic error is generally unknown since the exact algebraic solution is not usually available. Obtaining tractable upper and lower bounds on the algebraic error in terms of a readily computable and monotonically decreasing quantity (if any) of the chosen iterative solver is the distinctive feature of the designed optimal balanced stopping strategy. Moreover, this work states the exact constants; that is, there are no user-defined parameters in the optimal balanced stopping tests. Hence, an iterative solver incorporating the optimal balanced stopping methodology presented here will be a black-box iterative solver. Typically, employing such a stopping methodology leads to large computational savings and in any case rules out premature stopping.
The constants in the devised optimal balanced black-box stopping tests in the MINRES solver for symmetric positive-definite and symmetric indefinite linear systems can be estimated cheaply on the fly. The contribution of this thesis goes one step further for the nonsymmetric case: it provides an optimal balanced black-box stopping test not only in a memory-expensive Krylov solver such as GMRES but also in memory-inexpensive Krylov solvers such as BICGSTAB(L) and TFQMR. Currently, little convergence theory exists for the memory-inexpensive Krylov solvers, and hence devising stopping criteria for them is an active field of research. An optimal balanced black-box stopping criterion is also proposed for the nonlinear (Picard or Newton) iterative methods used for solving the finite-dimensional Navier-Stokes equations. The optimal balanced black-box stopping methodology presented in this thesis can be generalized to any iterative solver of a linear(ized) system arising from numerical approximation of a partial differential equation. The only prerequisites are the existence of a cheap and tight a posteriori error estimator for the approximation error, along with cheap and tractable bounds on the algebraic error.
510
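The balancing idea in this abstract can be sketched as a stopping rule: iterate until a computable bound on the algebraic error drops below the approximation error estimate. The example below is only an illustration under stated assumptions. It uses plain conjugate gradients on the matrix tridiag(-1, 2, -1), the standard bound ||e||_A <= ||r||_2 / sqrt(lambda_min) with lambda_min known in closed form, and a hypothetical value `eta` standing in for an a posteriori estimate. The thesis derives its own bounds and estimates the constants on the fly, which is not reproduced here.

```python
import math

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def laplacian_matvec(x):
    """Multiply by tridiag(-1, 2, -1) without storing the matrix."""
    n = len(x)
    return [2.0 * x[i]
            - (x[i - 1] if i > 0 else 0.0)
            - (x[i + 1] if i < n - 1 else 0.0)
            for i in range(n)]

def cg_balanced(b, eta, lam_min, max_iter=10000):
    """CG that stops once the energy-norm error bound is at most eta."""
    x = [0.0] * len(b)
    r = list(b)
    p = list(r)
    rr = dot(r, r)
    for k in range(max_iter):
        bound = math.sqrt(rr / lam_min)   # ||e||_A <= ||r||_2 / sqrt(lam_min)
        if bound <= eta:                  # algebraic error no longer dominates
            return x, k, bound
        Ap = laplacian_matvec(p)
        alpha = rr / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rr_new = dot(r, r)
        p = [ri + (rr_new / rr) * pi for ri, pi in zip(r, p)]
        rr = rr_new
    return x, max_iter, math.sqrt(rr / lam_min)

n = 50
lam_min = 4.0 * math.sin(math.pi / (2.0 * (n + 1))) ** 2  # exact for this matrix
b = [1.0] * n
eta_loose, eta_tight = 1e-2, 1e-6   # hypothetical approximation error estimates
x_loose, k_loose, bound_loose = cg_balanced(b, eta_loose, lam_min)
x_tight, k_tight, bound_tight = cg_balanced(b, eta_tight, lam_min)
print(k_loose, k_tight)
```

With a looser estimate the solver never stops later than with a tighter one, so iterations are not spent reducing an algebraic error that is already below the approximation error.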
9	Algorithm Design and Analysis for Large-Scale Semidefinite Programming and Nonlinear Programming
Lu, Zhaosong (24 June 2005)
The limiting behavior of weighted paths associated with the semidefinite program (SDP) map $X^{1/2}SX^{1/2}$ was studied, and some applications to error bound analysis and superlinear convergence of a class of primal-dual interior-point methods were provided. A new approach for solving large-scale well-structured sparse SDPs via a saddle point mirror-prox algorithm with $\mathcal{O}(\epsilon^{-1})$ efficiency was developed, based on exploiting sparsity structure and reformulating SDPs into smooth convex-concave saddle point problems. An iterative solver-based long-step primal-dual infeasible path-following algorithm for convex quadratic programming (CQP) was developed. The search directions of this algorithm were computed by means of a preconditioned iterative linear solver. A uniform bound, depending only on the CQP data, on the number of iterations performed by the preconditioned iterative linear solver was established. A polynomial bound on the number of iterations of this algorithm was also obtained. An efficient "nearly exact" type of method for solving large-scale "low-rank" trust region subproblems was proposed, which completely avoids the computation of Cholesky or partial Cholesky factorizations. A computational study of this method was also provided by applying it to some large-scale nonlinear programming problems.
Semidefinite program; Weighted paths; Trust region subproblem; Maximum weight basis preconditioner; Preconditioned iterative linear solver; Convex quadratic program; Smooth saddle point problem; Mirror-prox algorithm

Search results