Global ETD Search

31	Tensor rank and support rank in the context of algebraic complexity theory / Tensorrang och stödrang inom algebraisk komplexitetsteori Andersson, Pelle January 2023 Starting with the work of Volker Strassen, algorithms for matrix multiplication have been developed that have lower time complexity than the standard algorithm derived from the definition of multiplication. The general method behind these developments has been to view the bilinear mapping given by matrix multiplication as a three-dimensional tensor, for which there is an exact correspondence between the time complexity of the multiplication algorithm and tensor rank. The latter can be seen as a generalisation of matrix rank, being the minimum number of terms into which a tensor can be decomposed. However, in contrast to matrix rank, there is no general method for computing tensor rank, and many values remain unknown for important three-dimensional tensors. To further improve the theoretical bounds on the time complexity of matrix multiplication, the support rank of tensors has been introduced, which is the lowest rank among tensors with the same support in some basis. The goal of this master's thesis has been to go through the history of faster matrix multiplication and to examine specifically the properties of support rank for general tensors. With regard to the latter, a complete classification of the rank structures of support classes is made for the smallest non-degenerate tensor product space in three dimensions. From this, the size of a support can be seen to affect the pool of possible ranks within a support class. At the same time, there is in general no symmetry with regard to support size in the rank structures of the support classes, despite the existence of a symmetry and bijection between mirrored supports. Discussions about how to classify support rank structures for larger tensor product spaces are also included. 
/ Från och med forskning gjord av Volker Strassen har flera algoritmer för matrismultiplikation utvecklats som är effektivare visavi tidskomplexitet än standardalgoritmen som utgår från definitionen av multiplikation. Generellt sett har metoden varit att se den bilinjära avbildningen som matrismultiplikation är som en tredimensionell tensor. Där används att det finns en exakt korrespondens mellan multiplikationsalgoritmens tidskomplexitet och tensorrang. Det sistnämnda är ett slags generalisering av matrisrang, och är minsta antalet termer en tensor kan skrivas som. Till skillnad från matrisrang finns ingen allmän metod för att beräkna tensorrang, och många värden är okända även för välstuderade och viktiga tensorer. För att hitta fler övre begränsningar på matrismultiplikations tidskomplexitet har stödrang av tensorer införts, som är den lägsta rangen bland tensorer med samma stöd i en viss bas. Målet med detta examensarbete har varit att göra en genomgång av historien om snabbare matrismultiplikation, samt att specifikt undersöka egenskaper av stödrang för allmänna tredimensionella tensorer. För det sistnämnda görs en fullständig klassificering av rangstrukturer bland stödklasser för den minsta icke-degenererade tensorprodukten av tre vektorrum. Slutsatser är bl.a. att storleken av ett stöd kan ses påverka antalet möjliga ranger inom en stödklass. Samtidigt finns i allmänhet ingen symmetri med avseende på stödstorlek i stödklassernas rangstrukturer. Detta trots att det finns en symmetri och bijektion mellan speglade stöd. I arbetet ingår även en diskussion om hur stödrangstrukturer skulle kunna klassificeras för större tensorprodukter. linear algebra tensor product tensor rank matrix multiplication complexity linjär algebra tensorprodukt tensorrang matrismultiplikation komplexitet Other Mathematics Annan matematik
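As an illustration of the correspondence between tensor rank and multiplication cost described in this abstract, the following sketch shows Strassen's classical scheme for 2-by-2 matrices. It uses 7 scalar multiplications instead of the 8 in the definition, which shows that the rank of the 2-by-2 matrix multiplication tensor is at most 7. The function names are chosen here for illustration and are not taken from the thesis.

```python
def strassen_2x2(A, B):
    """Multiply two 2x2 matrices with 7 scalar multiplications (Strassen's scheme)."""
    (a11, a12), (a21, a22) = A
    (b11, b12), (b21, b22) = B
    # Each m_i is one rank-one term of the decomposition: a single product
    # of a linear form in the entries of A and a linear form in the entries of B.
    m1 = (a11 + a22) * (b11 + b22)
    m2 = (a21 + a22) * b11
    m3 = a11 * (b12 - b22)
    m4 = a22 * (b21 - b11)
    m5 = (a11 + a12) * b22
    m6 = (a21 - a11) * (b11 + b12)
    m7 = (a12 - a22) * (b21 + b22)
    # The entries of the product are linear combinations of the seven products.
    return [[m1 + m4 - m5 + m7, m3 + m5],
            [m2 + m4, m1 - m2 + m3 + m6]]


def naive_2x2(A, B):
    """Reference implementation from the definition (8 scalar multiplications)."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]
```

Applied recursively to block matrices, a 7-term decomposition of this kind gives the familiar O(n^log2(7)) running time. The support rank studied in the thesis relaxes the requirement that the decomposition reproduce the tensor exactly, which this sketch does not attempt to show.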
32	Combining Shortest Paths, Bottleneck Paths and Matrix Multiplication Shinn, Tong-Wook January 2014 We provide a formal mathematical definition of the Shortest Paths for All Flows (SP-AF) problem and present a number of efficient algorithms for it. The SP-AF problem combines the well-known Shortest Paths (SP) and Bottleneck Paths (BP) problems, and can be solved by utilising matrix multiplication. Thus, in our research into the SP-AF problem, we also make a series of contributions to the underlying topics of the SP problem, the BP problem, and matrix multiplication. For the topic of matrix multiplication we show that on an n-by-n two-dimensional (2D) square mesh array, two n-by-n matrices can be multiplied in exactly 1.5n ‒ 1 communication steps. This halves the number of communication steps required by the well-known Cannon’s algorithm, which runs on a mesh array of the same size. We provide two contributions for the SP problem. Firstly, we enhance the breakthrough algorithm by Alon, Galil and Margalit (AGM), which was the first algorithm to achieve a deeply sub-cubic time bound for solving the All Pairs Shortest Paths (APSP) problem on dense directed graphs. Our enhancement allows the algorithm by AGM to remain sub-cubic for larger upper bounds on integer edge costs. Secondly, we show that for graphs with n vertices, the APSP problem can be solved in exactly 3n ‒ 2 communication steps on an n-by-n 2D square mesh array. This improves on the previous result of 3.5n communication steps achieved by Takaoka and Umehara. For the BP problem, we show that we can compute the bottleneck of the entire graph without solving the All Pairs Bottleneck Paths (APBP) problem, resulting in a much more efficient time bound. Finally, we define an algebraic structure called the distance/flow semi-ring to formally introduce the SP-AF problem, and we provide many algorithms for solving the Single Source SP-AF (SSSP-AF) problem and the All Pairs SP-AF (APSP-AF) problem. 
For the APSP-AF problem, algebraic algorithms are given that utilise faster matrix multiplication over a ring. Graph Theory Graph Paths Shortest Paths SP APSP Bottleneck Paths BP APBP Matrix Multiplication Shortest Paths for All Flows SP-AF
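To make the link between path problems and matrix multiplication concrete, the sketch below shows the two textbook semi-ring products that underlie the SP and BP problems separately: the (min, +) product for distances and the (max, min) product for bottleneck capacities, each closed by repeated squaring. It is a minimal illustration with cubic-time products and hypothetical function names, not one of the algorithms from the thesis and not the combined distance/flow semi-ring.

```python
INF = float("inf")


def min_plus(A, B):
    """(min, +) product: entry (i, j) is the cheapest i -> k -> j combination."""
    n = len(A)
    return [[min(A[i][k] + B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def max_min(A, B):
    """(max, min) product: entry (i, j) is the widest i -> k -> j combination."""
    n = len(A)
    return [[max(min(A[i][k], B[k][j]) for k in range(n)) for j in range(n)]
            for i in range(n)]


def closure(M, product):
    """Square M under the given product until paths of up to n - 1 edges are covered.

    Assumes the diagonal holds the identity of the product's 'multiplication'
    (0 for min_plus, INF for max_min) so that shorter paths are retained.
    """
    n = len(M)
    length = 1
    while length < n - 1:
        M = product(M, M)
        length *= 2
    return M


# Example graph (an assumption for illustration): edges 0->1, 1->2 and 0->2.
COSTS = [[0, 4, 7], [INF, 0, 1], [INF, INF, 0]]         # INF = no edge
CAPACITIES = [[INF, 5, 2], [0, INF, 3], [0, 0, INF]]    # 0 = no edge
```

With these inputs, `closure(COSTS, min_plus)` gives all-pairs shortest distances and `closure(CAPACITIES, max_min)` gives all-pairs bottleneck capacities. In the example the best route from vertex 0 to vertex 2 goes through vertex 1 in both cases.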
33	Algorithm/architecture codesign of low power and high performance linear algebra compute fabrics Pedram, Ardavan 27 September 2013 In the past, we could rely on technology scaling and new micro-architectural techniques to improve the performance of processors. Nowadays, both of these methods are reaching their limits. The primary concern in future architectures with billions of transistors on a chip and limited power budgets is power/energy efficiency. Full-custom design of application-specific cores can yield up to two orders of magnitude better power efficiency over conventional general-purpose cores. However, a tremendous design effort is required in integrating a new accelerator for each new application. In this dissertation, we present the design of specialized compute fabrics that maintain the efficiency of full custom hardware while providing enough flexibility to execute a whole class of coarse-grain operations. The broad vision is to develop integrated and specialized hardware/software solutions that are co-optimized and co-designed across all layers ranging from the basic hardware foundations all the way to the application programming support through standard linear algebra libraries. We try to address these issues specifically in the context of dense linear algebra applications. In the process, we pursue the main questions that architects will face while designing such accelerators. How broad is this class of applications that the accelerator can support? What are the limiting factors that prevent utilization of these accelerators on the chip? What is the maximum achievable performance/efficiency? Answering these questions requires expertise and careful codesign of the algorithms and the architecture to select the best possible components, datapaths, and data movement patterns resulting in a more efficient hardware-software codesign. 
In some cases, codesign reduces complexities that are imposed on the algorithm side due to the initial limitations in the architectures. We design a specialized Linear Algebra Processor (LAP) architecture and discuss the details of mapping of matrix-matrix multiplication onto it. We further verify the flexibility of our design for computing a broad class of linear algebra kernels. We conclude that this architecture can perform a broad range of matrix-matrix operations as complex as matrix factorizations, and even Fast Fourier Transforms (FFTs), while maintaining its ASIC level efficiency. We present a power-performance model that compares state-of-the-art CPUs and GPUs with our design. Our power-performance model reveals sources of inefficiencies in CPUs and GPUs. We demonstrate how to overcome such inefficiencies in the process of designing our LAP. As we progress through this dissertation, we introduce modifications of the original matrix-matrix multiplication engine to facilitate the mapping of more complex operations. We observe the resulting performance and efficiencies on the modified engine using our power estimation methodology. When compared to other conventional architectures for linear algebra applications and FFT, our LAP is over an order of magnitude better in terms of power efficiency. Based on our estimations, up to 55 and 25 GFLOPS/W single- and double-precision efficiencies are achievable on a single chip in standard 45nm technology. / text Low-power design Energy-aware systems Performance analysis and design aids Matrix multiplication Memory hierarchy Level-3 BLAS Special-purpose hardware Matrix factorization Fast Fourier transform
34	Utilização de matrizes no estudo de orientação e posição de um braço robótico por meio das coordenadas de Denavit-Hartenberg. / Use of matrices in the study of orientation and position of a robotic arm by Denavit-Hartenberg coordinates Costa, Carlos Gomides da 08 August 2014 Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - CAPES / The main objective of this work is to present a different way of teaching matrix multiplication: a playful, applied approach that can make the subject more enjoyable and motivating for high school students. For many students the subject appears to have no practical applications, which is one reason for the lack of interest in learning it. The tool used to engage both students and teachers was the LEGO® Mindstorms NXT 2.0 kit, used to build a robotic arm (robotic manipulator) which, for this purpose, has three rotational joints. 
The LEGO kit was chosen for its ease of interaction with children and adolescents, and because it encourages the construction of knowledge by stimulating the solution of problems that may arise while building the robotic arm. The content is applied in obtaining the four Denavit-Hartenberg parameters, which are determined after the reference frames have been placed: three of these parameters are constants obtained by measurement on the robotic arm, and the fourth is a variable that depends on the intended movement or on the final position to be determined. / O presente trabalho tem como principal objetivo apresentar mais uma opção de ensinar o assunto multiplicação de matrizes, de uma forma mais lúdica, aplicada, que pode tornar o ensino mais prazeroso e atrativo para alunos do ensino médio, pois para muitos esse assunto não tem aplicações práticas, o que para outros é motivo da falta de interesse em aprender tal conteúdo. A ferramenta apresentada com o intuito de atrair tanto alunos quanto professores, foi o Kit LEGO® Mindstorms NXT 2.0, que possibilitou a construção do braço robótico ou manipulador robótico, que nesse trabalho foi apresentado com três juntas do tipo rotacionais. A escolha desse Kit LEGO® é justificada pela sua facilidade de interação com crianças e adolescentes, além de estimular a construção do aprendizado pois, estimula a solução de problemas que possam aparecer durante o processo de construção do braço robótico. A aplicação do conteúdo se dá na obtenção dos quatro parâmetros de Denavit-Hartenberg que são obtidos após a colocação dos sistemas de referência, onde três desses parâmetros são definidos a partir de medições feitas no braço robótico e o quarto parâmetro é variável dependente do movimento pretendido ou a posição final que se queira determinar. 
Multiplicação de matrizes LEGO® Mindstorms Denavit-Hartenberg Braço robótico Juntas rotacionais Matrix multiplication LEGO® Mindstorms Denavit e Hartenberg Robotic arm Rotational joints CIENCIAS EXATAS E DA TERRA::MATEMATICA
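The role of matrix multiplication in this setting can be sketched as follows. Under the standard Denavit-Hartenberg convention, each joint contributes one 4-by-4 homogeneous transform built from the four parameters (theta, d, a, alpha), and the pose of the end of the arm is the product of those transforms. For a rotational joint, theta is the variable and the other three are measured constants, as the abstract describes. The link values below are made-up placeholders, not measurements from the dissertation.

```python
import math


def dh_matrix(theta, d, a, alpha):
    """Homogeneous transform for one joint in the standard DH convention:
    Rot_z(theta) * Trans_z(d) * Trans_x(a) * Rot_x(alpha)."""
    ct, st = math.cos(theta), math.sin(theta)
    ca, sa = math.cos(alpha), math.sin(alpha)
    return [
        [ct, -st * ca, st * sa, a * ct],
        [st, ct * ca, -ct * sa, a * st],
        [0.0, sa, ca, d],
        [0.0, 0.0, 0.0, 1.0],
    ]


def matmul(A, B):
    """Plain matrix product, row by column."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def forward_kinematics(joint_angles, links):
    """Chain the per-joint transforms; links holds the constants (d, a, alpha)."""
    T = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
    for theta, (d, a, alpha) in zip(joint_angles, links):
        T = matmul(T, dh_matrix(theta, d, a, alpha))
    return T


# Hypothetical planar arm: three rotational joints, each link of length 1.
LINKS = [(0.0, 1.0, 0.0)] * 3
```

The last column of the resulting matrix holds the position of the end of the arm. For example, `forward_kinematics([math.pi / 2, 0.0, 0.0], LINKS)` places it three units along the y axis, up to rounding.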
35	The Kronecker Product Broxson, Bobbi Jo 01 January 2006 This paper presents a detailed discussion of the Kronecker product of matrices. It begins with the definition and some basic properties of the Kronecker product. Statements are then proven that reveal information concerning the eigenvalues, singular values, rank, trace, and determinant of the Kronecker product of two matrices. The Kronecker product is then employed to solve linear matrix equations. The commutativity of the Kronecker product is investigated using permutation matrices, and the Jordan canonical form of a Kronecker product is examined. Variations such as the Kronecker sum and the generalized Kronecker product are introduced. The paper concludes with an application of the Kronecker product to large least squares approximations. University of North Florida UNF Mathematics Kronecker products Jordan - Canonical Form matrix multiplication linear matrix equations large least squares problems Mathematics
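The use of the Kronecker product for linear matrix equations rests on a standard identity: with vec denoting column stacking, vec(AXB) = (Bᵀ ⊗ A) vec(X), so the matrix equation AXB = C becomes an ordinary linear system in the entries of X. The sketch below defines the product and checks the identity numerically. It is written in plain Python with helper names chosen here, and it is not code from the paper.

```python
def kron(A, B):
    """Kronecker product: every entry a_ij of A is replaced by the block a_ij * B."""
    return [[a * b for a in row_a for b in row_b]
            for row_a in A for row_b in B]


def transpose(M):
    return [list(col) for col in zip(*M)]


def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def vec(M):
    """Stack the columns of M into one flat list (column-major order)."""
    return [M[i][j] for j in range(len(M[0])) for i in range(len(M))]


def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def vec_identity_holds(A, X, B):
    """Check vec(A X B) == (B^T kron A) vec(X) for the given matrices."""
    left = vec(matmul(matmul(A, X), B))
    right = matvec(kron(transpose(B), A), vec(X))
    return left == right
```

With integer entries the comparison is exact. With floating-point entries a tolerance would be needed in place of `==`.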
36	AI Based Methods for Matrix Multiplication in High Resolution Simulations of Radio Access Networks / AI Baserade Metoder för Matris Multiplikationer för högupplösta simuleringar av Radionätverk Johnson, Marcus, Forslund, Herman January 2023 The increasing demand for mobile data has placed significant strain on radio access networks (RANs), leading to a continuous need for increased network capacity. A significant advancement in modern RANs is the ability to use several receivers and transmitters to allow for beamforming. One way to increase the capacity of the network is therefore to optimize the resource allocation by preprocessing the transmitted signals, which involves several costly matrix multiplications (MMs). The aim of the project was to investigate the potential of accelerating Ericsson's RAN simulations by using AI-based approximate matrix multiplication (AMM) algorithms. The main focus was on the multiply additionless (MADDNESS) algorithm, a product quantization technique that has achieved speedups of up to 100 times compared to exact MM and of 10 times compared to previous AMM methods. A version of MADDNESS that handles complex matrices was implemented in both Java and Python, and its speed and accuracy were evaluated against Ericsson's current MM implementation. The proposed implementation did not beat the benchmark with respect to speed, instead resulting in a 4-10 times slowdown in runtime. However, this may largely be because the languages used do not allow complete control over memory allocation. As such, the implementations at hand do not incorporate all the crucial features of the algorithm. In particular, the restricted version does not fully leverage the vectorization potential, which is one of the key contributors to the speed of the algorithm. Consequently, further improvements are necessary before employing the techniques in an end-to-end implementation. 
/ Den växande efterfrågan på mobildata har ökat belastningen på dagens radionätverk (RAN) och har medfört ett behov av att utvidga dess kapacitet. En betydande innovation inom RAN är beamforming, vilket är förmågan att fokusera digitala signaler mot mottagaren och på så vis öka signalstyrkan. En metod för att öka kapaciteten i ett nätverk är att optimera både kvaliteten av och resursallokeringen mellan nätverkets digitala kanaler, vilket medför tidskrävande matrismultiplikationer. Syftet med denna studie var att utforska om AI-baserade approximativa matrismultiplikationsalgoritmer har potentialen att accelerera Ericssons digitala tvilling-simuleringar. Studien fokuserade i huvudsak på produktkvantiseringsalgoritmen MADDNESS som påvisat potentialen att accelerera exakta matrismultiplikationer med en faktor 100, samt en faktor 10 snabbare än jämförbara approximativa metoder. En modifierad version av MADDNESS, som behandlar komplexa matriser, implementerades i Java samt Python, varefter precisionen och hastigheten utvärderades. Den föreslagna implementationen resulterade i en försämring med avseende på hastigheten med en faktor 4-10 jämfört med Ericssons nuvarande algoritmer. Den föreslagna implementationen saknar effektiv minnesallokering och misslyckas följaktligen att till fullo ta tillvara på vektoriseringspotentialen i MADDNESS. Detta indikerar att det är nödvändigt för ytterligare förbättringar innan algoritmen är användbar i den givna simuleringsmiljön. Product-Quantization MADDNESS Radio Access Networks Channel Estimation MIMO Approximate Matrix Multiplication Produktkvantisering MADDNESS RAN MIMO Approximativa matrismultiplikation Other Mathematics Annan matematik
37	Algebraic and multilinear-algebraic techniques for fast matrix multiplication Gouaya, Guy Mathias January 2015 This dissertation reviews the theory of fast matrix multiplication from a multilinear-algebraic point of view, as well as recent fast matrix multiplication algorithms based on discrete Fourier transforms over finite groups. To this end, the algebraic approach is described in terms of group algebras over groups satisfying the triple product property, and the construction of such groups via uniquely solvable puzzles. The higher order singular value decomposition is an important decomposition of tensors that retains some of the properties of the singular value decomposition of matrices. However, we have proven a novel negative result which demonstrates that the higher order singular value decomposition yields a matrix multiplication algorithm that is no better than the standard algorithm. / Mathematical Sciences / M. Sc. (Applied Mathematics) Matrix multiplication Multilinear algebra Discrete Fourier transform Tensor rank Triple product property Strassen algorithm Unique solvable puzzles Computer algebra 512.5 Multiplication, Complex Multilinear algebra Computer algorithms Multilinear algebra
38	Power and Energy Efficiency Evaluation for HW and SW Implementation of nxn Matrix Multiplication on Altera FPGAs Renbi, Abdelghani January 2009 In addition to performance, low-power design has become an important issue in the design process of mobile embedded systems. Feature-rich mobile electronics most often involve complex computation and intensive processing, which result in short battery lifetime, particularly when low-power design is not taken into consideration. Beyond mobile computers, thermal design also calls for low-power techniques to avoid component overheating, especially with VLSI technology. Low-power design has ushered in a new era. In this thesis we examined several techniques for achieving low-power design for FPGAs, ASICs and processors, of which ASICs offered more flexibility for exploiting hardware-oriented low-power techniques. We surveyed several power estimation methodologies, each of which had at least one disadvantage. We also compared and analyzed the power and energy consumption of three different designs that perform matrix multiplication on the Altera platform using a state-of-the-art FPGA device. We concluded that the NIOS II/e is not an energy-efficient alternative for multiplying nxn matrices compared to HW matrix multipliers on FPGAs, and that configware has enormous potential to reduce energy consumption costs. Low Power Design Techniques Energy Efficiency FPGA ASIC SoC NIOS CMOS Power Estimation Latency Matrix Multiplication Configware Reconfigurable Computing RISC Electronics Elektronik Computer and systems science Data- och systemvetenskap Systems engineering Systemteknik Computer engineering Datorteknik Data processing Databehandling
39	Multiplication matricielle efficace et conception logicielle pour la bibliothèque de calcul exact LinBox / Efficient matrix multiplication and design for the exact linear algebra library LinBox Boyer, Brice 21 June 2012 Dans ce mémoire de thèse, nous développons d'abord des multiplications matricielles efficaces. Nous créons de nouveaux ordonnancements qui permettent de réduire la taille de la mémoire supplémentaire nécessaire lors d'une multiplication du type Winograd tout en gardant une bonne complexité, grâce au développement d'outils externes ad hoc (jeu de galets), à des calculs fins de complexité et à de nouveaux algorithmes hybrides. Nous utilisons ensuite des technologies parallèles (multicœurs et GPU) pour accélérer efficacement la multiplication entre matrice creuse et vecteur dense (SpMV), essentielles aux algorithmes dits /boîte noire/, et créons de nouveaux formats hybrides adéquats. Enfin, nous établissons des méthodes de /design/ générique orientées vers l'efficacité, notamment par conception par briques de base, et via des auto-optimisations. Nous proposons aussi des méthodes pour améliorer et standardiser la qualité du code de manière à pérenniser et rendre plus robuste le code produit. Ces méthodes sont appliquées en particulier à la bibliothèque de calcul exact LinBox. / In this thesis, we first present efficient matrix multiplication techniques. We set up new schedules that allow us to minimize the extra memory required during a Winograd-style matrix multiplication while keeping the complexity competitive. To obtain them, we develop external tools (pebble game), tight complexity computations and new hybrid algorithms. We then use parallel technologies (multicore CPU and GPU) to efficiently accelerate sparse matrix-dense vector multiplication (SpMV), which is crucial to /blackbox/ algorithms, and we set up new hybrid formats for storing the sparse matrices. 
Finally, we establish generic design methods focusing on efficiency, especially via building-block design or self-optimization. We also propose tools for improving and standardizing code quality in order to make the code more sustainable and more robust. These methods are applied in particular to the LinBox computer algebra library. Algèbre linéaire exacte Bibliothèque mathématique générique Multiplication matricielle dense/SpMV Matrice dense/creuse Ordonnancements/jeu de galet Patrons de conception Exact linear algebra Generic mathematic library Dense matrix multiplication/SpMV Sparse/dense matrix Schedulings/pebble games Design patterns
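For readers unfamiliar with SpMV, the following sketch shows the operation on the common compressed sparse row (CSR) layout, with arithmetic reduced modulo a prime to reflect the exact (finite-field) setting. It is a plain single-threaded baseline with hypothetical function names. It is neither one of the hybrid formats developed in the thesis nor the LinBox API.

```python
def dense_to_csr(M):
    """Convert a dense matrix (list of rows) to CSR arrays.

    values[k] is the k-th stored nonzero, col_idx[k] its column, and
    row_ptr[i]:row_ptr[i + 1] is the slice of nonzeros belonging to row i.
    """
    values, col_idx, row_ptr = [], [], [0]
    for row in M:
        for j, x in enumerate(row):
            if x != 0:
                values.append(x)
                col_idx.append(j)
        row_ptr.append(len(values))
    return values, col_idx, row_ptr


def spmv_mod(csr, x, p):
    """Compute y = A x modulo p, touching only the stored nonzeros of A."""
    values, col_idx, row_ptr = csr
    y = []
    for i in range(len(row_ptr) - 1):
        acc = 0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_idx[k]]
        # Python integers do not overflow, so one reduction per row suffices here;
        # fixed-width implementations must reduce more carefully.
        y.append(acc % p)
    return y
```

Blackbox algorithms only ever call a routine like `spmv_mod`, which is why its cost and the choice of storage format dominate their running time.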
