• Refine Query
  • Source
  • Publication year
  • to
  • Language
  • 4
  • Tagged with
  • 4
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

Performance Analysis of Hardware/Software Co-Design of Matrix Solvers

Huang, Peng 28 November 2008
Solving a system of linear and nonlinear equations lies at the heart of many scientific and engineering applications such as circuit simulation, applications in electric power networks, and structural analysis. The exponentially increasing complexity of these computing applications and the high cost of supercomputing force us to explore affordable high performance computing platforms. The ultimate goal of this research is to develop hardware friendly parallel processing algorithms and build cost effective high performance parallel systems using hardware in order to enable the solution of large linear systems. In this thesis, FPGA-based general hardware architectures of selected iterative methods and direct methods are discussed. Xilinx Embedded Development Kit (EDK) hardware/software (HW/SW) codesigns of these methods are also presented. For iterative methods, FPGA based hardware architectures of Jacobi, combined Jacobi and Gauss-Seidel, and conjugate gradient (CG) are proposed. The convergence analysis of the LNS-based Jacobi processor demonstrates to what extent the hardware resource constraints and additional conversion error affect the convergence of Jacobi iterative method. Matlab simulations were performed to compare the performance of three iterative methods in three ways, i.e., number of iterations for any given tolerance, number of iterations for different matrix sizes, and computation time for different matrix sizes. The simulation results indicate that the key to a fast implementation of the three methods is a fast implementation of matrix multiplication. The simulation results also show that CG method takes less number of iterations for any given tolerance, but more computation time as matrix size increases compared to other two methods, since matrix-vector multiplication is a more dominant factor in CG method than in the other two methods. By implementing matrix multiplications of the three methods in hardware with Xilinx EDK HW/SW codesign, the performance is significantly improved over pure software Power PC (PPC) based implementation. The EDK implementation results show that CG takes less computation time for any size of matrices compared to other two methods in HW/SW codesign, due to that fact that matrix multiplications dominate the computation time of all three methods while CG requires less number of iterations to converge compared to other two methods.<p> For direct methods, FPGA-based general hardware architecture and Xilinx EDK HW/SW codesign of WZ factorization are presented. Single unit and scalable hardware architectures of WZ factorization are proposed and analyzed under different constraints. The results of Matlab simulations show that WZ runs faster than the LU on parallel processors but slower on a single processor. The simulation results also indicate that the most time consuming part of WZ factorization is matrix update. By implementing the matrix update of WZ factorization in hardware with Xilinx EDK HW/SW codesign, the performance is also apparently improved over PPC based pure software implementation.
2

Performance Analysis of Hardware/Software Co-Design of Matrix Solvers

Huang, Peng 28 November 2008 (has links)
Solving a system of linear and nonlinear equations lies at the heart of many scientific and engineering applications such as circuit simulation, applications in electric power networks, and structural analysis. The exponentially increasing complexity of these computing applications and the high cost of supercomputing force us to explore affordable high performance computing platforms. The ultimate goal of this research is to develop hardware friendly parallel processing algorithms and build cost effective high performance parallel systems using hardware in order to enable the solution of large linear systems. In this thesis, FPGA-based general hardware architectures of selected iterative methods and direct methods are discussed. Xilinx Embedded Development Kit (EDK) hardware/software (HW/SW) codesigns of these methods are also presented. For iterative methods, FPGA based hardware architectures of Jacobi, combined Jacobi and Gauss-Seidel, and conjugate gradient (CG) are proposed. The convergence analysis of the LNS-based Jacobi processor demonstrates to what extent the hardware resource constraints and additional conversion error affect the convergence of Jacobi iterative method. Matlab simulations were performed to compare the performance of three iterative methods in three ways, i.e., number of iterations for any given tolerance, number of iterations for different matrix sizes, and computation time for different matrix sizes. The simulation results indicate that the key to a fast implementation of the three methods is a fast implementation of matrix multiplication. The simulation results also show that CG method takes less number of iterations for any given tolerance, but more computation time as matrix size increases compared to other two methods, since matrix-vector multiplication is a more dominant factor in CG method than in the other two methods. By implementing matrix multiplications of the three methods in hardware with Xilinx EDK HW/SW codesign, the performance is significantly improved over pure software Power PC (PPC) based implementation. The EDK implementation results show that CG takes less computation time for any size of matrices compared to other two methods in HW/SW codesign, due to that fact that matrix multiplications dominate the computation time of all three methods while CG requires less number of iterations to converge compared to other two methods.<p> For direct methods, FPGA-based general hardware architecture and Xilinx EDK HW/SW codesign of WZ factorization are presented. Single unit and scalable hardware architectures of WZ factorization are proposed and analyzed under different constraints. The results of Matlab simulations show that WZ runs faster than the LU on parallel processors but slower on a single processor. The simulation results also indicate that the most time consuming part of WZ factorization is matrix update. By implementing the matrix update of WZ factorization in hardware with Xilinx EDK HW/SW codesign, the performance is also apparently improved over PPC based pure software implementation.
3

Channel coding application for cdma2000 implemented in a FPGA with a Soft processor core

Kling, Mikael January 2005 (has links)
<p>With today’s FPGA’s it’s possible to implement complete systems in a single FPGA. With help of Soft Processor Cores like the MicroBlaze processor several microcontrollers can be implemented in the same FPGA.</p><p>The third generation telecommunications system, cdma2000, has several channels, which has specific assignments. The Sync channel purpose is to attain initial time synchronization.</p><p>The purpose with this thesis has been to implement the Sync channel in a FPGA with use of a MicroBlaze processor. An evaluation of the concept of using a Soft Processor Core instead of ordinary DSP’s and microcontrollers would then be conducted.</p><p>This thesis has resulted in a system with a MicroBlaze processor that has the Sync channel as a peripheral. It’s possible to write information via HyperTerminal to the MicroBlaze processor which then uses this data as input to the Sync channel. The Sync channel then modulates the data according to the cdma2000 specifications and then outputs it onto an external pin at the FPGA.</p><p>The evaluation of this concept hasn’t resulted in a general recommendation whether to use ASIC or FPGA’s in a system. The concept of using Soft Processor Cores certainly has its benefits and is something that could be thought of in the future when designing a system.</p>
4

Channel coding application for cdma2000 implemented in a FPGA with a Soft processor core

Kling, Mikael January 2005 (has links)
With today’s FPGA’s it’s possible to implement complete systems in a single FPGA. With help of Soft Processor Cores like the MicroBlaze processor several microcontrollers can be implemented in the same FPGA. The third generation telecommunications system, cdma2000, has several channels, which has specific assignments. The Sync channel purpose is to attain initial time synchronization. The purpose with this thesis has been to implement the Sync channel in a FPGA with use of a MicroBlaze processor. An evaluation of the concept of using a Soft Processor Core instead of ordinary DSP’s and microcontrollers would then be conducted. This thesis has resulted in a system with a MicroBlaze processor that has the Sync channel as a peripheral. It’s possible to write information via HyperTerminal to the MicroBlaze processor which then uses this data as input to the Sync channel. The Sync channel then modulates the data according to the cdma2000 specifications and then outputs it onto an external pin at the FPGA. The evaluation of this concept hasn’t resulted in a general recommendation whether to use ASIC or FPGA’s in a system. The concept of using Soft Processor Cores certainly has its benefits and is something that could be thought of in the future when designing a system.

Page generated in 0.0151 seconds