Global ETD Search

31	Accelerating Scientific Applications using High Performance Dense and Sparse Linear Algebra Kernels on GPUs Abdelfattah, Ahmad 15 January 2015 (has links) High performance computing (HPC) platforms are evolving to more heterogeneous configurations to support the workloads of various applications. The current hardware landscape is composed of traditional multicore CPUs equipped with hardware accelerators that can handle high levels of parallelism. Graphical Processing Units (GPUs) are popular high performance hardware accelerators in modern supercomputers. GPU programming has a different model than that for CPUs, which means that many numerical kernels have to be redesigned and optimized specifically for this architecture. GPUs usually outperform multicore CPUs in some compute intensive and massively parallel applications that have regular processing patterns. However, most scientific applications rely on crucial memory-bound kernels and may witness bottlenecks due to the overhead of the memory bus latency. They can still take advantage of the GPU compute power capabilities, provided that an efficient architecture-aware design is achieved. This dissertation presents a uniform design strategy for optimizing critical memory-bound kernels on GPUs. Based on hierarchical register blocking, double buffering and latency hiding techniques, this strategy leverages the performance of a wide range of standard numerical kernels found in dense and sparse linear algebra libraries. The work presented here focuses on matrix-vector multiplication kernels (MVM) as representative and most important memory-bound operations in this context. Each kernel inherits the benefits of the proposed strategies. By exposing a proper set of tuning parameters, the strategy is flexible enough to suit different types of matrices, ranging from large dense matrices, to sparse matrices with dense block structures, while high performance is maintained. 
Furthermore, the tuning parameters are used to maintain the relative performance across different GPU architectures. Multi-GPU acceleration is proposed to scale the performance on several devices. The performance experiments show improvements ranging from 10% up to more than a fourfold speedup against competitive GPU MVM approaches. Performance impacts on high-level numerical libraries and a computational astronomy application are highlighted, since such memory-bound kernels are often located in innermost levels of the software chain. The excellent performance obtained in this work has led to the adoption of code in NVIDIA's widely distributed cuBLAS library. GPU Computing Numerical Linear Algebra Memory-Bound Kernels Performance Optimization
32	Exploiting Data Sparsity in Matrix Algorithms for Adaptive Optics and Seismic Redatuming Hong, Yuxi 07 June 2023 (has links) This thesis addresses the exponential growth of experimental data and the resulting computational complexity seen in two major scientific applications, which account for significant cycles consumed on today’s supercomputers. The first application concerns computation of the adaptive optics system in next-generation ground-based telescopes, which will expand our knowledge of the universe but confronts the astronomy community with daunting real-time computation requirements. The second application deals with emerging frequency-domain redatuming methods, e.g., Marchenko redatuming, which are game-changers in exploration geophysics. They are valuable to oil and gas applications and will soon be to geothermal exploration and carbon capture storage. However, they are impractical at industrial scale due to prohibitive computational complexity and memory footprint. We tackle the aforementioned challenges by designing high-performance algebraic and stochastic algorithms, which exploit the data sparsity structure of the matrix operator. We show that popular randomized algorithms from machine learning can also solve large covariance matrix problems that capture the correlations of wavefront sensors detecting the atmospheric turbulence for ground-based telescopes. Algebraic compression based on low-rank approximations that retains the most significant portion of the spectrum of the operator provides numerical solutions at the accuracy level required by the application. In addition, selective use of lower precisions can further reduce the data volume by trading off application accuracy for memory footprint. Reducing memory footprint has ancillary implications for reduced energy expenditure and reduced execution time because moving a word is more expensive than computing with it on today’s architectures. 
We exploit the data sparsity of matrices representative of these two scientific applications and propose four algorithms to accelerate the corresponding computational workload. In soft real-time control of an adaptive optics system, we design a stochastic Levenberg-Marquardt method and high-performance solver for Discrete-time Algebraic Riccati Equations. We create a tile low-rank matrix-vector multiplication algorithm used in both hard real-time control of ground-based telescopes and seismic redatuming. Finally, we leverage multiple precisions to further improve the performance of seismic redatuming applications. We implement our algorithms on essentially all families of currently relevant HPC architectures and customized AI accelerators and demonstrate significant performance improvement and validated numerical solutions. Numerical Linear Algebra TLR-MVM SLM Adaptive Optics Seismic Redatuming
33	Doubly-Invariant Subgroups for p=3 Wyles, Stacie Nicole 29 May 2015 (has links) No description available. Mathematics
34	Diagonal Estimation with Probing Methods Kaperick, Bryan James 21 June 2019 (has links) Probing methods for trace estimation of large, sparse matrices have been studied for several decades. In recent years, there has been some work to extend these techniques to instead estimate the diagonal entries of these systems directly. We extend some analysis of trace estimators to their corresponding diagonal estimators, propose a new class of deterministic diagonal estimators which are well-suited to parallel architectures along with heuristic arguments for the design choices in their construction, and conclude with numerical results on diagonal estimation and ordering problems, demonstrating the strengths of our newly-developed methods alongside existing methods. / Master of Science / In the past several decades, as computational resources increase, a recurring problem is that of estimating certain properties of very large linear systems (matrices containing real or complex entries). One particularly important quantity is the trace of a matrix, defined as the sum of the entries along its diagonal. In this thesis, we explore a problem that has only recently been studied, in estimating the diagonal entries of a particular matrix explicitly. For these methods to be computationally more efficient than existing methods, and with favorable convergence properties, we require the matrix in question to have a majority of its entries be zero (the matrix is sparse), with the largest-magnitude entries clustered near and on its diagonal, and very large in size. In fact, this thesis focuses on a class of methods called probing methods, which are of particular efficiency when the matrix is not known explicitly, but rather can only be accessed through matrix vector multiplications with arbitrary vectors. 
Our contribution is new analysis of these diagonal probing methods which extends the heavily-studied trace estimation problem, new applications for which probing methods are a natural choice for diagonal estimation, and a new class of deterministic probing methods which have favorable properties for large parallel computing architectures which are becoming ever-more-necessary as problem sizes continue to increase beyond the scope of single processor architectures. Probing Methods Numerical Linear Algebra Computational Inverse Problems
35	Students' Conceptions of Normalization Watson, Kevin L. 13 October 2020 (has links) Improving the learning and success of students in undergraduate science, technology, engineering, and mathematics (STEM) courses has become an increased focus of education researchers within the past decade. As part of these efforts, discipline-based education research (DBER) has emerged within STEM education as a way to address discipline-specific challenges for teaching and learning, by combining expert knowledge of the various STEM disciplines with knowledge about teaching and learning (Dolan et al., 2018; National Research Council, 2012). Particularly important to furthering DBER and improving STEM education are interdisciplinary studies that examine how the teaching and learning of specific concepts develop among and across various STEM disciplines... / Ph. D. / Dissertation proposal normalization norm quantum physics linear algebra coordination class
36	Fast, Sparse Matrix Factorization and Matrix Algebra via Random Sampling for Integral Equation Formulations in Electromagnetics Wilkerson, Owen Tanner 01 January 2019 (has links) Many systems designed by electrical & computer engineers rely on electromagnetic (EM) signals to transmit, receive, and extract either information or energy. In many cases, these systems are large and complex. Their accurate, cost-effective design requires high-fidelity computer modeling of the underlying EM field/material interaction problem in order to find a design with acceptable system performance. This modeling is accomplished by projecting the governing Maxwell equations onto finite dimensional subspaces, which results in a large matrix equation representation (Zx = b) of the EM problem. In the case of integral equation-based formulations of EM problems, the M-by-N system matrix, Z, is generally dense. For this reason, when treating large problems, it is necessary to use compression methods to store and manipulate Z. One such sparse representation is provided by so-called H^2 matrices. At low-to-moderate frequencies, H^2 matrices provide a controllably accurate data-sparse representation of Z. The scale at which problems in EM are considered "large" is continuously being redefined to be larger. This growth of problem scale is happening not only in EM but across all other sub-fields of computational science as well. The pursuit of increasingly large problems is unwavering in all these sub-fields, and this drive has long outpaced the rate of advancements in processing and storage capabilities in computing. This has caused computational science communities to now face the computational limitations of standard linear algebraic methods that have been relied upon for decades to run quickly and efficiently on modern computing hardware. 
This common set of algorithms can only produce reliable results quickly and efficiently for small to mid-sized matrices that fit into the memory of the host computer. Therefore, the drive to pursue larger problems has even begun to outpace the reasonable capabilities of these common numerical algorithms; the deterministic numerical linear algebra algorithms that have gotten matrix computation this far have proven to be inadequate for many problems of current interest. This has computational science communities focusing on improvements in their mathematical and software approaches in order to push further advancement. Randomized numerical linear algebra (RandNLA) is an emerging area that both academia and industry believe to be a strong candidate to assist in overcoming the limitations faced when solving massive and computationally expensive problems. This thesis presents results of recent work that uses a random sampling method (RSM) to implement algebraic operations involving multiple H^2 matrices. Significantly, this work is done in a manner that is non-invasive to an existing H^2 code base for filling and factoring H^2 matrices. The work presented thus expands the existing code's capabilities with minimal impact on existing (and well-tested) applications. In addition to this work with randomized H^2 algebra, improvements in sparse factorization methods for the compressed H^2 data structure will also be presented. The reported developments in filling and factoring H^2 data structures assist in, and allow for, the further pursuit of large and complex problems in computational EM (CEM) within simulation code bases that utilize the H^2 data structure. Numerical Simulations Randomized Numerical Linear Algebra Computational Electromagnetics Computational Linear Algebra Computational Engineering Electrical and Computer Engineering Electromagnetics and Photonics
37	Hypergraph Capacity with Applications to Matrix Multiplication Peebles, John Lee Thompson, Jr. 01 May 2013 (has links) The capacity of a directed hypergraph is a particular numerical quantity associated with a hypergraph. It is of interest because of certain important connections to longstanding conjectures in theoretical computer science related to fast matrix multiplication and perfect hashing as well as various longstanding conjectures in extremal combinatorics. We give an overview of the concept of the capacity of a hypergraph and survey a few basic results regarding this quantity. Furthermore, we discuss the Lovász number of an undirected graph, which is known to upper bound the capacity of the graph (and in practice appears to be the best such general purpose bound). We then elaborate on some attempted generalizations/modifications of the Lovász number to undirected hypergraphs that we have tried. It is not currently known whether these attempted generalizations/modifications upper bound the capacity of arbitrary hypergraphs. An important method for proving lower bounds on hypergraph capacity is to exhibit a large independent set in a strong power of the hypergraph. We examine methods for this and show a barrier to attempts to usefully generalize certain of these methods to hypergraphs. We then look at cap sets: independent sets in powers of a certain hypergraph. We examine certain structural properties of them with the hope of finding ones that allow us to prove upper bounds on their size. Finally, we consider two interesting generalizations of capacity and use one of them to formulate several conjectures about connections between cap sets and sunflower-free sets. 05C69 independent sets 05C35 Extremal problems 05C50 Graphs and linear algebra 05C65 Hypergraphs Discrete Mathematics and Combinatorics
38	Structured Matrices and the Algebra of Displacement Operators Takahashi, Ryan 01 May 2013 (has links) Matrix calculations underlie countless problems in science, mathematics, and engineering. When the involved matrices are highly structured, displacement operators can be used to accelerate fundamental operations such as matrix-vector multiplication. In this thesis, we provide an introduction to the theory of displacement operators and study the interplay between displacement and natural matrix constructions involving direct sums, Kronecker products, and blocking. We also investigate the algebraic behavior of displacement operators, developing results about invertibility and kernels. 15A23 Factorization of matrices 65F99 Numerical linear algebra Algebra Other Applied Mathematics
39	An attempt to represent geometrically the imaginary of algebra Tobias, Ruth K. January 1987 (has links) In 1981 the author submitted that "many of the (then) more recent school syllabuses remain disjointed and give expression still to a school mathematics course as step-by-step progression through a list of disparate topics". The position has not changed. It is not yet generally accepted that there can no longer be an accepted body of mathematical knowledge that needs to be taught. The rapid development of new technology and the introduction of the microcomputer should enable the 'modern' mathematics of the early 1960's to enhance the mathematical experiences of pupils in a practical and comprehensible way and prompt a new style of teaching and learning mathematics. There is, however, a fundamental core of mathematics which must inevitably find a place in the school mathematics curriculum. In Part I of the thesis the emphasis is on a method of presentation of certain key topics which illustrate the basic pattern of a group structure. Former complications at school level of putting plane geometry on a logical footing have to be avoided. The use of complex numbers highlights significant and sometimes rather difficult geometrical ideas. In Part II the author attempts to show how some of these ideas may be presented to extend the basic pattern to that of linear algebra. The work culminates in Part III with the use of linear complex algebra to present more vividly the symmetries of the Platonic solids. The author anticipates the realistic presentation of the aesthetic side of 3-dimensional geometry and takes a look at its possible presentation through the medium of the microcomputer. At this early stage of the development of the ideas to be discussed, there can be no formal testing of the results by quantitative analysis. Evaluation of the viability of the proposals will be qualitative and the comments of 'critical academic friends' will be included. 
The originality demanded of a piece of research goes beyond the exposition. Here it will consist of new insights into ideas appropriate to senior pupils in schools and a rewriting of existing material often thought to be beyond their scope. The work is supported by suggested lesson sequences, transcripts of recorded presentations, and examples of students' work. Subsequent development must face the question of assessment and evaluation at sixth-form level of the proposed new style of teaching mathematics. The author makes some suggestions in the concluding chapter. 512
40	Lie isomorphisms of triangular and block-triangular matrix algebras over commutative rings Cecil, Anthony John 24 August 2016 (has links) For many matrix algebras, every associative automorphism is inner. We discuss results by Đoković that a non-associative Lie automorphism φ of a triangular matrix algebra Tₙ over a connected unital commutative ring, is of the form φ(A)=SAS⁻¹ + τ(A)I or φ(A)=−SJ Aᵀ JS⁻¹ + τ(A)I, where S ∈ Tₙ is invertible, J is an antidiagonal permutation matrix, and τ is a generalized trace. We incorporate additional arguments by Cao that extended Đoković’s result to unital commutative rings containing nontrivial idempotents. Following this we develop new results for Lie isomorphisms of block upper-triangular matrix algebras over unique factorization domains. We build on an approach used by Marcoux and Sourour to characterize Lie isomorphisms of nest algebras over separable Hilbert spaces. We find that these Lie isomorphisms generally follow the form φ = σ + τ where σ is either an associative isomorphism or the negative of an associative anti-isomorphism, and τ is an additive mapping into the center, which maps commutators to zero. This echoes established results by Martindale for simple and prime rings. / Graduate Mathematics Linear Algebra Matrix Algebras Ring Theory Lie Isomorphisms Triangular Algebras Block-Triangular Algebras

Search results