
Accelerator for Flexible QR Decomposition and Back Substitution

January 2020
abstract: QR decomposition (QRD) is one of the most common linear algebra operations used to decompose a square or non-square matrix. It has a wide range of applications, especially in Multiple Input-Multiple Output (MIMO) communication systems. Unfortunately, it has high computational complexity: for a matrix of size n×n, QRD has O(n^3) complexity, and back substitution, which is used to solve a system of linear equations, has O(n^2) complexity. Thus, as the matrix size increases, the hardware resource requirement for QRD and back substitution increases significantly. This thesis presents the design and implementation of a flexible QRD and back substitution accelerator using a folded architecture. It supports matrix sizes of 4×4, 8×8, 12×12, 16×16, and 20×20 with a low hardware resource requirement. The proposed architecture is based on the systolic array implementation of the Givens algorithm for QRD. It is built with three different types of computation blocks, which are connected in a 2-D array structure. These blocks are controlled by a scheduler, which allows the blocks to be reused to perform the computation for any input matrix size that is a multiple of 4. The blocks are designed using two basic programming elements, which support both the forward and backward paths to compute the matrix R in QRD and the column matrix X in back substitution. The proposed architecture has been mapped to a Xilinx Zynq UltraScale+ FPGA (Field Programmable Gate Array), the ZCU102. All inputs are complex with a precision of 40 bits (38 fractional bits and 1 sign bit). The architecture can be clocked at 50 MHz. The synthesis results of the folded architecture for different matrix sizes are presented. The results show that the folded architecture can support QRD and back substitution for large inputs that otherwise cannot fit on an FPGA when implemented using a flat architecture. The memory sizes required for different matrix sizes are also presented.
Dissertation/Thesis / Masters Thesis, Electrical Engineering, 2020
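
For readers unfamiliar with the two operations named in the abstract, the following is a minimal software sketch of Givens-rotation QRD followed by back substitution on complex inputs. It is not the thesis's folded systolic hardware design: it runs sequentially in double-precision floating point rather than 40-bit fixed point, the function name `givens_qr_solve` is hypothetical, and the input matrix is assumed to be square and nonsingular. The doubly nested rotation loop, with each rotation touching a whole row, gives the O(n^3) cost, and the final loop gives the O(n^2) cost of back substitution.

```python
import math


def givens_qr_solve(A, b):
    """Solve A x = b for a square, nonsingular complex matrix A.

    Illustrative reference only (hypothetical helper, floating point):
    Givens rotations reduce [A | b] to [R | Q^H b], then back
    substitution recovers x from the upper-triangular system.
    """
    n = len(A)
    # Augmented copy [A | b], so each rotation also updates the right-hand side.
    M = [[complex(v) for v in row] + [complex(b[i])] for i, row in enumerate(A)]

    # QRD: zero every entry below the diagonal, one column at a time.
    for j in range(n):
        for i in range(j + 1, n):
            a, z = M[j][j], M[i][j]
            r = math.hypot(abs(a), abs(z))
            if r == 0.0:
                continue  # both entries already zero, nothing to rotate
            c, s = a / r, z / r
            row_j, row_i = M[j], M[i]
            # Unitary 2x2 rotation [[conj(c), conj(s)], [-s, c]] applied to rows j and i.
            M[j] = [c.conjugate() * p + s.conjugate() * q for p, q in zip(row_j, row_i)]
            M[i] = [-s * p + c * q for p, q in zip(row_j, row_i)]

    # Back substitution on R x = Q^H b, from the last row upward.
    x = [0j] * n
    for k in range(n - 1, -1, -1):
        acc = M[k][n] - sum(M[k][m] * x[m] for m in range(k + 1, n))
        x[k] = acc / M[k][k]
    return x
```

Calling `givens_qr_solve(A, b)` with a 4×4 complex matrix `A` and a length-4 vector `b` returns the solution as a list of four complex numbers. Applying the rotations to the augmented matrix avoids forming Q explicitly, which is a design choice of this sketch rather than something stated in the abstract.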
