High Performance Soft Processor Architectures for Applications with Irregular Data- and Instruction-level Parallelism

Embedded systems based on FPGAs frequently incorporate soft processors, which are prevalent because of their flexibility and adaptability to the application. However, soft processors deliver lower performance than hard cores and custom logic, so faster soft processors are desirable.

Many soft processor architectures have been studied in the past, including vector processors and VLIWs. These architectures focus on regular applications in which data- and/or instruction-level parallelism can be extracted offline. Applications with irregular parallelism, however, benefit only marginally from such architectures. Targeting such applications, we investigate superscalar, out-of-order, and Runahead execution on FPGAs. Although these architectures have been investigated in the ASIC world, they have not been studied thoroughly for FPGA implementations.

We start by investigating the challenges of implementing a typical in-order pipeline on FPGAs and propose effective solutions to shorten the processor critical path. We then show that superscalar processing is undesirable on FPGAs because its wide datapaths lead to low clock frequency and high area cost. Accordingly, we focus on investigating and proposing FPGA-friendly out-of-order (OoO) and Runahead soft processors.

We propose FPGA-friendly alternatives for various mechanisms and components used in OoO execution. We introduce CFC, a novel copy-free checkpointing mechanism that exploits FPGA block RAMs for fast and dense storage. Using CFC, we propose an FPGA-friendly register renamer and investigate the design and implementation of instruction schedulers on FPGAs.

We then investigate Runahead execution and introduce NCOR, a non-blocking cache tailored for FPGAs. NCOR removes the CAM-based structures used in conventional designs and achieves a high clock frequency of 278 MHz. Finally, we introduce SPREX, a complete Runahead soft core incorporating CFC and NCOR. Compared to Nios II, SPREX provides as much as 38% higher performance for applications with irregular data-level parallelism, with minimal area overhead.

http://hdl.handle.net/1807/65627

Identifier	oai:union.ndltd.org:TORONTO/oai:tspace.library.utoronto.ca:1807/65627
Date	14 July 2014
Creators	Aasaraai, Kaveh
Contributors	Moshovos, Andreas
Source Sets	University of Toronto
Language	en_ca
Detected Language	English
Type	Thesis
